APOPTOSIS MODULATORS THAT INTERACT WITH THE
HUNTINGTON'S DISEASE GENE 

BACKGROUND OF THE INVENTION 

This application relates to a family of apoptosis modulators that interact with the 
Huntington's Disease gene product, and to methods and compositions relating thereto. 

"Interacting proteins" are proteins which associate in vivo to form specific complexes. 
Non-covalent bonds, including hydrogen bonds, hydrophobic interactions and other
molecular associations, form between the proteins when two protein surfaces are matched or
have affinity for each other. This affinity or match is required for the recognition of the two
proteins, and the formation of an interaction. Protein-protein interactions are involved in the
assembly of enzyme subunits; in antigen-antibody reactions; in forming the supramolecular
structures of ribosomes, filaments, and viruses; in transport; and in the interaction of
receptors on a cell with growth factors and hormones.

Huntington's disease is an adult onset disorder characterized by selective neuronal loss
in discrete regions of the brain and spinal cord that leads to progressive movement disorder,
personality change and intellectual decline. From onset, which generally occurs around age
40, the disease progresses with worsening symptoms, ending in death approximately 18 years
after onset.

While the biochemical cause of Huntington's disease has remained elusive, a mutation in a
gene within the 4p16.3 subband of chromosome 4 has been identified and linked to the disease.
This gene, referred to as the Huntington's Disease or HD gene, contains two repeat regions, a
CAG repeat region and a CCG repeat region. Testing of Huntington's disease patients has shown
that the CAG region is highly polymorphic, and that the number of CAG repeat units in the CAG
repeat region is a very reliable indicator of having inherited the gene for Huntington's
disease. Thus, in control individuals and in most individuals suffering from neuropsychiatric
disorders other than Huntington's disease, the number of CAG repeats is between 9 and 35,
while in individuals suffering from Huntington's disease the number of CAG repeats is
expanded and is 36 or greater.
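The repeat-count criterion above can be illustrated with a minimal sketch. It assumes a plain nucleotide string as input and simply takes the longest uninterrupted run of CAG units; the function names and the toy sequence are hypothetical and are not part of the specification, and a real genotyping assay would size the repeat by other means.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Number of CAG units in the longest uninterrupted CAG run of seq."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_cag_count(n_repeats: int) -> str:
    """Apply the ranges quoted in the text (9-35 control, 36 or more HD)."""
    if n_repeats >= 36:
        return "expanded (HD range)"
    if n_repeats >= 9:
        return "control range"
    return "below the reported control range"


# Hypothetical toy sequence: 44 CAG units between arbitrary flanking bases.
toy_allele = "ATG" + "CAG" * 44 + "CCGCCA"
n = longest_cag_run(toy_allele)
print(n, classify_cag_count(n))
```

Counts below 9 are reported separately here only because the text gives no classification for them.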
To date, no differences have been observed at the total RNA, mRNA or protein
levels between normal and HD-affected individuals. Thus, the function of the HD protein
and its role in the pathogenesis of Huntington's Disease remain to be elucidated.



SUMMARY OF THE INVENTION

We have now identified a protein, designated HIP1, that interacts differently with
the gene product of a normal (16 CAG repeat) and an expanded (>44 CAG repeat) HD gene.
The HIP1 protein originally isolated from a yeast two-hybrid screen is encoded by a 1.2 kb
cDNA (Seq. ID. No. 1), devoid of stop codons, that is expressed as a 400 amino acid
polypeptide (Seq. ID. No. 2). Subsequent study has elucidated additional sequence for HIP1
such that a 1090 amino acid protein is now known (Seq. ID No. 5). Expression of the HIP1
protein was found to be enriched in the brain.

Analysis of the sequence of the HIP1 protein indicated that it includes a death effector
domain (DED), suggesting an apoptotic function. Thus, it appears that a normal function of
huntingtin may be to bind HIP1 and related apoptosis modulators, reducing their effectiveness
in stimulating cell death. Since expanded huntingtin performs this function less well, there is
an increase in HIP1-modulated cell death in individuals with an expanded repeat in the HD
gene. Furthermore, additional members of the same family of proteins have been identified
which also contain a DED. Thus, the present invention provides a new class of apoptotic
modulators which are referred to as HIP-apoptosis modulating proteins.

This understanding of the likely role of huntingtin and HIP1 or related proteins in the
pathology of Huntington's Disease offers several possibilities for therapy. First, because the
function of huntingtin apparently depends at least in part on the ability to interact with
HIP-apoptosis modulating proteins, added expression (e.g., via gene therapy) of normal
(non-expanded) huntingtin or of the HIP-binding region of huntingtin should provide a
therapeutic benefit. Other DED-interacting peptides could also be used to mask and reduce the
interaction of HIP-apoptosis modulating proteins with the death signaling complex.
Alternatively, a mutant form of HIP-protein from which the DED has been deleted might be
introduced, for example using gene therapy techniques. Because HIP-apoptosis modulating
proteins have been shown to self-associate, a protein with a deleted DED may compete with
endogenous HIP-protein in the formation of these associations, thereby reducing the amount
of apoptotically-active HIP-protein.

BRIEF DESCRIPTION OF THE DRAWING

Fig. 1 graphically depicts the amount of interaction between HIP1 and Huntingtin
proteins with varying lengths of polyglutamine repeat;
Fig. 2 compares the nucleic acid sequences of human and murine HIP1 and HIP1a;
Fig. 3 compares the amino acid sequences of human and murine HIP1 and HIP1a;
Fig. 4 shows the sequences of various death effector domains in comparison to the
DED of human and murine HIP1 and HIP1a;
Fig. 5 shows the genomic organization of human HIP1;
Fig. 6 compares the sequences of human HIP1 with ZK370.3 protein of C. elegans;
Fig. 7 shows mouse ESTs with homology to human HIP1 cDNA used to screen a
mouse brain library;
Fig. 8 shows the effect of HIP1 on susceptibility of cells to stress; and
Figs. 9A - 9C show the toxicity of HIP1 in the presence of huntingtin with different
lengths of polyglutamine repeats.

DETAILED DESCRIPTION OF THE INVENTION 

This application relates to a new family of proteins that function as modulators of
apoptosis. At least some of these proteins, notably the human protein designated HIP1, interact
with the gene product of the Huntington's disease gene. Other proteins within the family
possess at least 40% and preferably more than 50% nucleotide identity with HIP1 and include
a death effector domain (DED). Such proteins are referred to in the specification and claims
hereof as "HIP-apoptosis modulating proteins."

The first HIP-apoptosis modulating protein identified was designated as HIP1. HIP1
was identified using the yeast two-hybrid system described in US Patent No. 5,283,173, which
is incorporated herein by reference. Briefly, this system utilizes two chimeric genes or
plasmids expressible in a yeast host. The yeast host is selected to contain a detectable marker
gene having a binding site for the DNA binding domain of a transcriptional activator. The
first chimeric gene or plasmid encodes a DNA-binding domain which recognizes the binding
site of the detectable marker gene and a test protein or protein fragment. The second chimeric
gene or plasmid encodes a second test protein and a transcriptional activation domain.
The two chimeric genes or plasmids are introduced into the host cell and expressed, and the
cells are cultivated. Expression of the detectable marker gene occurs only when the gene
product of the first chimeric gene or plasmid binds to the binding site of the
detectable marker gene, and a transcriptional activation domain is brought into sufficient
proximity to the DNA-binding domain, an occurrence which is facilitated by protein-protein
interactions between the first and second test proteins. By selecting for cells expressing the
detectable marker gene, those cells which contain chimeric genes or plasmids for interacting
proteins can be identified, and the gene can be recovered and identified.

In testing for Huntington Interacting Proteins, several different plasmids were
prepared containing portions of the human HD gene. The first four, identified as 16pGBT9,
44pGBT9, 80pGBT9 and 128pGBT9, were GAL4 DNA binding domain-HD in-frame
fusions containing nucleotides 314 to 1955 (amino acids 1-540) of the published HD cDNA
sequences cloned into the vector pGBT9 (Clontech). These plasmids contain a CAG repeat
region of 16, 44, 80 and 128 glutamine-encoding repeats, respectively. A clone (DMK
BamHIpGBT9) was made by fusing a cDNA encoding the first 544 amino acids of the
myotonic dystrophy gene (a gift from R. Korneluk) in-frame with the GAL4-DNA BD of
pGBT9 and was used as a negative control.

These plasmids have been used to identify and characterize HIP1, as well as two
additional HD-interacting proteins, HIP2 and HIP3, which have not yet been tested for
function as apoptosis modulators. These plasmids can be further used for the identification of
additional interacting proteins which do act as apoptosis modulators, and for tests to refine
the region on the protein in which the interaction occurs. Thus, one aspect of the invention is
these four plasmids, and the use of these plasmids in identifying HD-interacting proteins.
Furthermore, it will be appreciated that the GAL4 DNA-binding and activating domains are
not the only domains which can be used in the yeast two-hybrid assay. Thus, in a broader
sense, the invention encompasses any chimeric genes or plasmids containing nucleotides 314
to 1955 of the HD gene together with an activating or DNA-binding domain suitable for use
in the yeast one-, two- or three-hybrid assay for proteins critical in either binding to the HD
protein or responsible for regulated expression of the HD gene.

After introducing the plasmids into Y190 yeast host cells, transforming the host cells
with an adult human brain Matchmaker™ (Clontech) cDNA library coupled with a GAL4
activating domain, and selecting for the expression of two detectable marker genes to identify
clones containing genes for interacting proteins, the activating domain plasmids were
recovered and analyzed. As a result of this analysis, three different cDNA fragments were
identified as encoding HD-interacting proteins and designated as HIP1, HIP2 and HIP3.
The nucleic acid sequence of HIP1, as originally recovered in the yeast two-hybrid assay, is
given in Seq. ID. No 1. The polypeptide which it encodes is given by Seq. ID No. 2. Further
investigation of the HIP1 cDNA resulted in the characterization of a longer region of cDNA
totaling 4795 bases and a corresponding protein, the sequences of which are given by Seq ID
Nos. 3 and 4, respectively. A further portion of the HIP1 protein was characterized,
extending the length to the complete protein sequence of 1090 amino acids (Seq. ID No. 5).

The cDNA molecules encoding HIP-apoptosis modulating proteins, particularly those
encoding portions of HIP1, can be explored using oligonucleotide probes, for example for
amplification and sequencing. In addition, oligonucleotide probes complementary to the
cDNA can be used as diagnostic probes to localize and quantify the presence of HIP1 DNA.
Probes of this type with a one or two base mismatch can also be used in site-directed
mutagenesis to introduce variations into the HIP1 sequence which may increase or decrease
the apoptotic activity. Preferred targets for such mutations would be the death effector
domains. Thus, a further aspect of the present invention is an oligonucleotide probe,
preferably having a length of from 15-40 bases, which specifically and selectively hybridizes
with the cDNA given by Seq. ID No. 1 or 3 or a sequence complementary thereto. As used
herein, the phrase "specifically and selectively hybridizes with" the cDNA refers to primers
which will hybridize with the cDNA under stringent hybridization conditions.
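As a rough illustration of the probe geometry described above, the following sketch enumerates antisense oligonucleotides of a chosen length (restricted to the preferred 15-40 base window) that are perfectly complementary to a target strand. The function names and the toy target sequence are hypothetical; the sketch does not model stringency, specificity against other sequences, or melting behaviour.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_probes(cdna: str, length: int = 20):
    """Yield (start, probe) pairs; each probe is complementary to
    cdna[start:start + length] and is written 5' to 3'."""
    if not 15 <= length <= 40:
        raise ValueError("probe length outside the preferred 15-40 base range")
    for start in range(len(cdna) - length + 1):
        yield start, reverse_complement(cdna[start:start + length])


# Hypothetical 30-base target used only to exercise the functions.
toy_cdna = "ATGGCCGAAGATACCCCACCAAACTTGGAC"
probes = list(antisense_probes(toy_cdna, 15))
print(len(probes), probes[0])
```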

Probes of this type can also be used for diagnostic purposes to characterize risk of
Huntington's Disease-like symptoms arising in individuals where the symptoms are present in
the family history but are not associated with an expansion of the CAG repeat. Such
symptoms may arise from a mutation in HIP1 or other HIP-apoptosis modulating protein
which alters the interaction of this protein with huntingtin, thereby increasing the apoptotic
activity of the protein even in the presence of a normal (non-expanded) huntingtin molecule.
An appropriate probe for this purpose would be one which hybridizes with or adjacent to the
region encoding the huntingtin binding region of the HIP-apoptosis modulating protein. In
HIP1, this lies within amino acids 129-514.

DNA sequencing of the HIP1 cDNA initially isolated from the yeast two-hybrid
screen (Seq. ID No. 1) revealed a 1.2 kb cDNA that shows no significant degree of nucleic
acid identity with any stretch of DNA using the blastn program at NCBI
(blast@ncbi.nlm.nih.gov). When the larger HIP1 cDNA sequence (SEQ ID NO. 3) was
translated into a polypeptide, the HIP1 cDNA coding region (nucleotides 328-3069) was
observed to be devoid of stop codons, and to produce a 914 amino acid polypeptide. A polypeptide
identity search revealed an identity match over the entire length of the protein (46%
conservation) with that of a hypothetical protein from C. elegans (ZK370.3 protein; C.
elegans cosmid ZK370). This C. elegans protein shares identity with the mouse talin gene,
which encodes a 217 kDa protein implicated in maintaining integrity of the cytoskeleton.
It also shares identity with the SLA2/MOP2/END4 gene from Saccharomyces cerevisiae,
which is known to encode an essential cytoskeleton-associated protein required for the
accumulation and/or maintenance of plasma membrane H+-ATPase on the cell surface.
When pairwise comparisons are performed between HIP1 and the C. elegans ZK370.3 protein
(Genpept accession number celzk370.3), they show 26% complete identity and an overall 46%
level of conservation. Comparative analysis between HIP1 and SLA2/MOP2/END4 (EMBL
accession number Z22811) demonstrates similar conservation (20% identity, 40%
conservation).
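The arithmetic behind the coding-region length, and the kind of identity figure quoted above, can be checked with a short sketch. Nucleotides 328-3069 span 2742 bases, which is 914 codons, matching the stated polypeptide length. The identity function below is a simplified illustration on a pre-aligned pair of strings; its name, the gap convention and the toy alignment are assumptions, and alignment programs may normalise differently.

```python
def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of codons in an inclusive, 1-based nucleotide interval."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("interval length is not a multiple of three")
    return length // 3


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Assumes the two strings are already aligned, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Coding interval quoted in the text for SEQ ID NO. 3.
print(codon_count(328, 3069))

# Hypothetical toy alignment, not taken from HIP1 or ZK370.3.
print(round(percent_identity("MKT-LL", "MRTALL"), 1))
```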

Further exploration revealed several important facts about HIP1 that implicate it
significantly in the pathogenesis of Huntington's Disease. First, as shown in Fig. 1, it was
found that the native interaction between HD protein and HIP1 is influenced by the number
of CAG repeats. Second, it was found that expression of the HIP1 protein is enriched in the
brain. The highest amounts of expression are in the cortex, with lower levels being seen in
the cerebellum, caudate and putamen.
It has also been observed that huntingtin proteins with expanded polyglutamine tracts
can aggregate into large, irregularly shaped deposits in HD brains, transgenic mice and in
vitro cell culture. We have shown that in HEK (human embryonic kidney) 293T cells, the
aggregation of full-length and smaller huntingtin fragments occurs after the cells have been
exposed to a period of apoptotic stress. Martindale, et al., Nature Genetics 18: 150-154
(1998). In order to assess the consequence of HIP1 expression in cultured cells, we used
huntingtin aggregation as one marker of viability. We found that cells
cotransfected with huntingtin (128 CAG repeats) and HIP1 contained aggregates comparable
to those observed following application of apoptotic stress with sub-lethal doses of tamoxifen
in 14% of the cells, and that these cells were the ones in which both genes had been
introduced, as reflected by a double marker experiment. Transfection of a gene encoding a
fusion protein of 128 repeat huntingtin and the DED domain from HIP1 ligated in the sense
orientation resulted in aggregate formation in 30 to 50% of the cells.

The implications of the apoptotic activity of HIP1 are two-fold. First, the fact that
this activity is apparently differentially modulated by interaction with huntingtin having
normal and expanded repeats implicates HIP1 in the apoptotic neuronal death which is
observed in Huntington's disease and makes HIP1 a logical target for therapy. A second
implication of the apoptotic activity of HIP1 is the potential for use of HIP1 as a therapeutic
agent to induce apoptosis in cancer cells.

Therapeutic targeting of HIP1 or other HIP-apoptosis modulating proteins might take
any of several forms, but will in general be a treatment involving administration of a
composition that reduces the apoptotic activity of the HIP-apoptosis modulating protein. As
used in the specification and claims hereof, the term "administration" includes direct
administration of a composition active to reduce apoptotic activity as well as indirect
administration which might include administration of pro-drugs or nucleic acids that encode
the desired therapeutic composition.

One class of compositions which can be used in the therapeutic methods of the
invention are those compositions which interfere with the activity of HIP-apoptosis
modulating proteins by binding to the proteins, thereby masking and reducing the interaction
of HIP-apoptosis modulating proteins with the death signaling complex. Within this class of
compositions are normal (non-expanded) huntingtin, administered, for example, via increased
expression of exogenous HD genes; the HIP-binding region of huntingtin, administered via
gene therapy techniques; and other DED-interacting peptides. Other DED-interacting
peptides which might be used in a therapeutic method of this type include FADD (Boldin et
al., Cell 85: 803-815 (1996)) and caspase 8 (Muzio et al., Cell 85: 817-827 (1996)).

An alternative form of therapy involves the use of a mutant form of HIP1 or other
HIP-apoptosis modulating protein from which the DED has been deleted. DED-containing
proteins, including HIP1, are self-associating, and this self-association has been shown to be
important for activity (Muzio et al., Cell 85: 817-827 (1996)). Thus, a protein with a deleted
DED may compete with endogenous HIP-protein in the formation of these associations,
thereby reducing the amount of apoptotically-active HIP-protein.

In addition to HIP1, we have identified a further human protein, HIP1a, from a
human frontal cortex cDNA library. HIP1a is a family member of HIP1, and thus a
HIP-apoptosis modulator in accordance with the invention. A partial sequence of HIP1a (the 5'
portion of HIP1a remains to be characterized) is given by SEQ ID Nos. 6 and 7. The isolated
and characterized portion of HIP1a shows 53% nucleotide identity and 58% amino acid
conservation with HIP1 (Table 1, Figs. 2 and 3).

We have also isolated two mouse proteins, mHIP1 and mHIP1a (SEQ. ID Nos. 8-11),
which appear to be the murine homologues of human HIP1 and HIP1a. As in the case of
human HIP1a, the 5' portion of mHIP1 remains to be isolated. At present, mHIP1 shows 85%
nucleotide identity and 90% amino acid conservation with huHIP1 (Table 1, Figs. 2 and 3).
mHIP1a shows 60% nucleotide identity and 61% amino acid conservation with huHIP1
(Table 1, Figs. 2 and 3). mHIP1a shows stronger homology to huHIP1a; it shows 87%
nucleotide identity and 91% amino acid conservation with huHIP1a (Table 1, Figs. 2 and 3).

Taken together, these findings indicate that mHIP1 is the murine homologue of huHIP1
whereas mHIP1a is most likely the murine homologue of huHIP1a. As mentioned previously,
HIP1 shows sequence similarity to Sla2p in S. cerevisiae and the hypothetical protein
ZK370.3 in C. elegans. Similarly, huHIP1a, mHIP1, and mHIP1a show sequence similarity to
Sla2p and ZK370.3 (Table 2). The carboxy-terminal regions of huHIP1a, mHIP1, and
mHIP1a all show considerable homology to the mammalian membrane
cytoskeletal-associated protein, talin. This suggests that these three proteins may also play a role
in the regulation of membrane events through interactions with the underlying cytoskeleton.

HIP1 contains a death effector domain (DED), a domain which is also present in a
number of proteins involved in the apoptotic pathway (Fig. 4). This suggests that HIP1 may
act as a modulator of the apoptosis pathway. The DED in huHIP1 is present between amino
acid positions 287 and 368. Similarly, HIP1a, mHIP1, and mHIP1a also contain a DED. In
huHIP1a the DED is present at amino acids 1-78 of the recovered fragment. In mHIP1 and
mHIP1a, the DEDs are present at amino acids 128-210 and 388-470, respectively. The DEDs
present in huHIP1a, mHIP1 and mHIP1a all show significant percentage amino acid
conservation to the DED present in huHIP1 (Table 3).

Increasing expression of normal (non-expanded) huntingtin or the HIP-apoptotic
modulator-binding portion thereof, of a modified HIP-apoptotic modulator in which the DED
has been deleted, or of a DED-interacting protein or peptide can be accomplished using gene
therapy approaches. In general, this will involve introduction of DNA encoding the
appropriate protein or peptide in an expressible vector into the brain cells. Expression of
HIP-apoptosis modulating proteins may also be useful in treatment of cancer, in which case
application to other cell types would be desired, and cells expressing HIP-apoptosis
modulating proteins may be used for screening of therapeutic compounds. Thus, in a more
general sense, expression vectors are defined herein as DNA sequences that are required for
the transcription of cloned copies of genes and the translation of their mRNAs in an
appropriate cell type. Specifically designed vectors allow the shuttling of DNA between
hosts such as bacteria-yeast or bacteria-animal cells. An appropriately constructed expression
vector may contain: an origin of replication for autonomous replication in host cells,
selectable markers, a limited number of useful restriction enzyme sites, a potential for high
copy number, and active promoters. A promoter is defined as a DNA sequence that directs
RNA polymerase to bind to DNA and initiate RNA synthesis. A strong promoter is one
which causes mRNAs to be initiated at high frequency. Expression vectors may include, but
are not limited to, cloning vectors, modified cloning vectors, specifically designed plasmids
or viruses.

A variety of mammalian expression vectors may be used to express recombinant
HIP-apoptosis modulating proteins or fragments thereof in mammalian cells. Commercially
available mammalian expression vectors which may be suitable for recombinant
HIP-apoptosis modulating protein expression include, but are not limited to, pMC1neo
(Stratagene), pXT1 (Stratagene), pSG5 (Stratagene), EBO-pSV2-neo (ATCC 37593),
pBPV-1(8-2) (ATCC 37110), pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC
37199), pRSVneo (ATCC 37198), pSV2-dhfr (ATCC 37146), pUCTag (ATCC 37460), and
1ZD35 (ATCC 37565). Other vectors which have been shown to be suitable expression
systems in mammalian cells include the herpes simplex viral based vectors: pHSV1 (Geller
et al. Proc. Natl. Acad. Sci. 87:8950-8954 (1990)); recombinant retroviral vectors: MFC
(Jaffee et al. Cancer Res. 53:2221-2226 (1993)); Moloney-based retroviral vectors: LN,
LNSX, LNCX, LXSN (Miller and Rosman Biotechniques 7:980-989 (1989)); vaccinia viral
vector: MVA (Sutter and Moss Proc. Natl. Acad. Sci. 89:10847-10851 (1992)); recombinant
adenovirus vectors: pJM17 (Ali et al. Gene Therapy 1:367-384 (1994)), (Berkner K. L.
Biotechniques 6:616-624 (1988)); second generation adenovirus vector: DE1/DE4 adenoviral
vectors (Wang and Finer Nature Medicine 2:714-716 (1996)); and Adeno-associated viral
vectors: AAV/Neo (Muro-Cacho et al. J. Immunotherapy 11:231-237 (1992)).

The expression vector may be introduced into host cells via any one of a number of
techniques including but not limited to transformation, transfection, infection, protoplast
fusion, and electroporation. The expression vector-containing cells are clonally propagated
and individually analyzed to determine whether they produce the desired protein. Delivery of
retroviral vectors to brain and nervous system tissue has been described in US Patents Nos.
4,866,042, 5,082,670 and 5,529,774, which are incorporated herein by reference. These
patents disclose the use of cerebral grafts or implants as one mechanism for introducing
vectors bearing therapeutic gene sequences into the brain, as well as an approach in which the
vectors are transmitted across the blood brain barrier.

To further illustrate the methods of making the materials which are the subject of this 
invention, and the testing which has established their utility, the following non-limiting 
experimental procedures are provided. 
EXAMPLE 1

IDENTIFICATION OF INTERACTING PROTEINS
GAL4-HD cDNA constructs

An HD cDNA construct (44pGBT9) with 44 CAG repeats was generated,
encompassing amino acids 1-540 of the published HD cDNA. This cDNA fragment was
fused in frame to the GAL4 DNA-binding domain (BD) of the yeast two-hybrid vector
pGBT9 (Clontech). Other HD cDNA constructs, 16pGBT9, 80pGBT9 and 128pGBT9, were
constructed identically to 44pGBT9 but included 16, 80 or 128 CAG repeats,
respectively.

Another clone (DMKDBamHIpGBT9) containing the first 544 amino acids of the
myotonic dystrophy gene (a gift from R. Korneluk) was fused in-frame with the GAL4-DNA
BD of pGBT9 and was used as a negative control. Plasmids expressing the GAL4-BD-RAD7
(D. Gietz, unpublished) and SIR3 were used as a positive control for the β-galactosidase filter
assay.

The clones IT15-23Q, IT15-44Q and HAP1 were generous gifts from Dr. C. Ross.
These clones represent a previously isolated huntingtin interacting protein that has a higher
affinity for the expanded form of the HD protein.

Yeast strains, transformations and β-galactosidase assays
The yeast strain Y190 (MATa leu2-3,112, ura3-52, trp1-901, his3-Δ200, ade2-101,
gal4Δgal80Δ, URA3::GAL-lacZ, LYS2::GAL-HIS3, cyhr) was used for all transformations
and assays. Yeast transformations were performed using a modified lithium acetate
transformation protocol, and cells were grown at 30°C using appropriate synthetic complete
(SC) dropout media.

The β-galactosidase chromogenic filter assays were performed by transferring the
yeast colonies onto Whatman filters. The yeast cells were lysed by submerging the filters in
liquid nitrogen for 15-20 seconds. Filters were allowed to dry at room temperature for at
least five minutes and placed onto filter paper presoaked in Z-buffer (100 mM sodium
phosphate (pH 7.0), 10 mM KCl, 1 mM MgSO4) supplemented with 50 mM
2-mercaptoethanol and 0.07 mg/ml 5-bromo-4-chloro-3-indolyl β-D-galactoside (X-gal).
Filters were placed at 37°C for up to 8 hours.

Yeast two-hybrid screening for huntingtin interacting protein (HIP)
cDNAs from a human adult brain Matchmaker™ cDNA library (Clontech) were
transformed into the yeast strain Y190 already harboring the 44pGBT9 construct. The
transformants were plated onto one hundred 150 mm x 15 mm circular culture dishes
containing SC media deficient in Trp, Leu and His. The herbicide 3-amino-triazole (3-AT)
(25 mM) was utilized to limit the number of false His+ positives (31). The yeast
transformants were placed at 30°C for 5 days and β-galactosidase filter assays were performed
on all colonies found after this time, as described above, to identify β-galactosidase+ clones.
Primary His+/β-galactosidase+ clones were then patched in order onto a grid on SC
-Trp/-Leu/-His (25 mM 3-AT) plates and assayed again for His+ growth and the ability to turn
blue with a filter assay. Secondary positives were identified for further analysis. Proteins
encoded by positive cDNAs were designated as HIPs (Huntingtin Interactive Proteins).

Approximately 4.0 x 10' Trp/Leu auxotrophic transformants were screened and of 14 clones 
isolated, 12 represented the same cDNA (HIP1), and the other 2 cDNAs, HIP2 and HIP3, were
each represented only once.

The HIP cDNA plasmids were isolated by growing the His+/β-galactosidase+ colony
in SC -Leu media overnight, lysing the cells with acid-washed glass beads and
electroporating the bacterial strain KC8 (leuB auxotrophic) with the yeast lysate. The KC8
ampicillin resistant colonies were replica plated onto M9 (-Leu) plates. The plasmid DNA
from M9+ colonies was transformed into DH5-α for further manipulation.

EXAMPLE 2

CONFIRMATION OF INTERACTIONS
The HIP1-GAL4-AD cDNA activated both the lacZ and His reporter genes in the
yeast strain Y190 only when co-transformed with the GAL4-BD-HD construct, and not with the
negative controls (Fig. 1) of the vector alone or a random fusion protein of the myotonin
kinase gene. In order to assess the influence of the polyglutamine tract on the interaction
between HIP1 and HD, semi-quantitative β-galactosidase assays were performed.
GAL4-BD-HD fusion proteins with 16, 44, 80 and 128 glutamine repeats were assayed for
their strength of interaction with the GAL4-AD-HIP1 fusion protein.

Liquid β-galactosidase assays were performed by inoculating a single yeast colony
into appropriate synthetic complete (SC) dropout media and growing it to OD600 0.6-1.5. Five
millilitres of overnight culture was pelleted and washed once with 1 ml of Z-Buffer, then
resuspended in 100 µl Z-Buffer supplemented with 38 mM 2-mercaptoethanol and 0.05%
SDS. Acid washed glass beads (~100 µl) were added to each sample, which was vortexed for four
minutes by repeatedly alternating 30 seconds of vortexing with 30 seconds on ice. Each sample
was pelleted and 10 µl of lysate was added to 500 µl of lysis buffer. The samples were
incubated in a 30°C waterbath for 30 seconds and then 100 µl of a 4 mg/ml o-nitrophenyl
β-D-galactopyranoside (ONPG) solution was added to each tube. The reaction was allowed
to continue for 20 minutes at 30°C and stopped by the addition of 500 µl of 1 M Na2CO3 and
placing the samples on ice. Subsequently, OD420 was taken in order to calculate the
β-galactosidase activity with the equation 1000 x OD420/(t x V x OD600), where t is the
elapsed time (minutes) and V is the amount of lysate used.
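The activity equation above translates directly into a small helper. This is a minimal sketch: the function name is hypothetical, the text does not state the unit of V (millilitres are assumed here, so 10 microlitres is entered as 0.01), and the example readings are invented solely to exercise the formula.

```python
def beta_gal_units(od420: float, od600: float, minutes: float, lysate_volume: float) -> float:
    """Activity = 1000 x OD420 / (t x V x OD600), as given in the text.

    lysate_volume is assumed to be in millilitres (an assumption; the
    source only calls V "the amount of lysate used").
    """
    if minutes <= 0 or lysate_volume <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    return 1000.0 * od420 / (minutes * lysate_volume * od600)


# Hypothetical readings: OD420 = 0.5, OD600 = 1.0, 20 minute reaction, 0.01 ml lysate.
example_units = beta_gal_units(od420=0.5, od600=1.0, minutes=20, lysate_volume=0.01)
print(round(example_units, 1))
```

Because the result scales inversely with V, comparisons between constructs are only meaningful when the same volume unit is used throughout.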

The specificity of the HIP1-HD interaction can be observed using the chromogenic
filter assay. Only yeast cells harboring HIP1 and HD activate both the HIS and lacZ reporter
genes in the Y190 yeast host. The cells that contain HIP1 with HD constructs with 80 or
128 CAG repeats turn blue approximately 45 minutes after the cells with the smaller sized
repeats (16 or 44).

No difference in the β-galactosidase activity was observed between the 16 and 44
repeats or between the 80 and 128 repeats. However, a significant difference (p<0.05) in
activity is seen between the smaller repeats (16 and 44) and the larger repeats (80 and 128)
(Figure 1).

EXAMPLE 3 

DNA SEQUENCING, cDNA ISOLATION AND 5' RACE
Oligonucleotide primers were synthesized on an ABI PCR-mate oligo-synthesizer.
DNA sequencing was performed using an ABI 373 fluorescent automated DNA sequencer.
The HIP cDNAs were confirmed to be in-frame with the GAL4-AD by sequencing across the
AD-HIP1 cloning junction using an AD oligonucleotide (5' GAA GAT ACC CCA CCA
AAC 3') (Seq. ID No. 12).

Subsequently, primer walking was used to determine the remaining sequences. A
human frontal cortex >4.0 kb cDNA library (a gift from S. Mental) was screened to isolate
the full length HIP1 gene. Fifty nanograms of a 558 base pair Eco RI fragment from the
original HIP1 cDNA was radioactively labeled with [α-32P]-dCTP using nick-translation and
the probe was allowed to hybridize to filters containing >105 pfu/ml of the cDNA library
overnight at 65°C in Church buffer (see Northern blot protocol). The filters were washed at
65°C for 10 minutes with 1 X SSPE, 15 minutes at 65°C with 1 X SSPE and 0.1% SDS, then
for thirty minutes and fifteen minutes with 1 X SSPE and 0.1% SDS. The filters were
exposed to X-ray film (Kodak, XAR5) overnight at -70°C. Primary positives were isolated
and replated, and subsequent secondary positives were hybridized and washed as for the
primary screen. The resulting positive phage were converted into plasmid DNA by
conventional methods (Stratagene) and the cDNA isolated and sequenced.

In order to obtain the most 5' sequence of the HIP1 gene, a Rapid Amplification of
cDNA Ends (RACE) protocol was performed according to the manufacturer's
recommendations (BRL). First strand cDNA was synthesized using the oligo HIP1-242R (5'
GCT TGA CAG TGT ACT CAT AAA GGT GGC TGC AGT CC 3') (Seq. ID No. 13).
After dCTP tailing of the cDNA with terminal deoxynucleotidyl transferase, two rounds of
35 cycles (94°C 1 minute; 53°C 1 minute; 72°C 2 minutes) of PCR using HIP1-R2 (5' GGA CAT
GTC CAG GGA GTT GAA TAC 3') (Seq. ID No. 14) and an anchor primer (5' (CUA)4
GGC CAC GCG TCG ACT AGT ACG GGI IGG GII GGG IIG 3') (BRL; Seq. ID No. 15)
were performed. The resulting 650 base pair PCR product was cloned using the TA
cloning system (Invitrogen) and sequenced using T3 and T7 primers. Sequence ID Nos. 1
and 3 show the sequence of the HIP1 cDNAs obtained.
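
As a quick sanity check on the two gene-specific primers quoted above, their length, GC
content and reverse complement can be computed as in the following sketch. The helper
names are illustrative assumptions; the anchor primer is left out because it contains
uracil and inosine, which this simple A/C/G/T complement table does not handle.

```python
# Complement table for plain DNA bases only (no U, I or ambiguity codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the spaces that group the primer into triplets and upper-case it."""
    return "".join(seq.split()).upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def reverse_complement(seq: str) -> str:
    """Reverse complement, written 5' to 3'."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# Primer sequences as printed in the text (Seq. ID Nos. 13 and 14).
PRIMERS = {
    "HIP1-242R": "GCT TGA CAG TGT ACT CAT AAA GGT GGC TGC AGT CC",
    "HIP1-R2": "GGA CAT GTC CAG GGA GTT GAA TAC",
}

for name, seq in PRIMERS.items():
    print(name, len(clean(seq)), f"{gc_fraction(seq):.2f}", reverse_complement(seq))
```

Both primers are reverse primers, so their reverse complements correspond to the
sense-strand sequence they anneal to.
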
EXAMPLE 4
DNA AND AMINO ACID ANALYSES
Overlapping DNA sequence was assembled using the program MacVector and sent
via email or Netscape to the BLAST server at NIH (http://www.ncbi.nlm.nih.gov) to search
for sequence similarities with known DNA (blastn) or protein (tblastn) sequences. Amino
acid alignments were performed with the program ClustalW.

EXAMPLE 5 

FISH DETECTION SYSTEM AND IMAGE ANALYSIS 
The HIP1 cDNA isolated from the two-hybrid screen was mapped by fluorescent in
situ hybridization (FISH) to normal human lymphocyte chromosomes counterstained with
propidium iodide and DAPI. Biotinylated probe was detected with avidin-fluorescein
isothiocyanate (FITC). Images of metaphase preparations were captured by a
thermoelectrically cooled charge coupled camera (Photometrics). Separate images of DAPI
banded chromosomes and FITC targeted chromosomes were obtained. Hybridization signals
were acquired and merged using image analysis software and pseudo colored blue (DAPI)
and yellow (FITC) as described and overlaid electronically. This study showed that HIP1
maps to a single genomic locus at 7q11.2.

EXAMPLE 6

NORTHERN BLOT ANALYSIS 
RNA was isolated using the single step method of homogenization in guanidinium
isothiocyanate and fractionated on a 1.0% agarose gel containing 0.6 M formaldehyde. The
RNA was transferred to a Hybond N membrane (Amersham) and crosslinked with ultraviolet
radiation.

Hybridization of the Northern blot with β-actin as an internal control probe
provided confirmation that the RNA was intact and had transferred. The 1.2 kb HIP1 cDNA
was labeled using nick translation and incorporation of α-32P-dCTP. Hybridization of the
original 1.2 kb HIP1 cDNA was carried out in Church buffer (0.5 M sodium phosphate
buffer, pH 7.2, 2.7% sodium dodecyl sulphate, 1 mM EDTA) at 55°C overnight. Following
hybridization, Northern blots were washed once for 10 minutes in 2.0 X SSPE, 0.1% SDS at
room temperature and twice for 10 minutes in 0.15 X SSPE, 0.1% SDS. Autoradiography
was carried out for one to three days using Hyperfilm (Amersham) at -70°C.

Analysis of HIP1 RNA levels by Northern blot revealed that the 10 kilobase HIP1
message is present in all tissues assessed. However, the levels of RNA are not uniform,
with brain having the highest levels of expression and peripheral tissues having less
message. No apparent differences in RNA expression were noted between control samples
and HD affected individuals.

EXAMPLE 7 
TISSUE LOCALIZATION OF HIP1
Tissue localization of HIP1 was studied using a variety of techniques as described
below.

Subcellular distribution of HIP-1 protein in adult human and mouse brain

Biochemical fractionation studies revealed HIP1 to be a membrane-associated
protein. No immunoreactivity was seen by Western blotting in cytosolic fractions, using the
anti-HIP1-pep1 polyclonal antibody. HIP1 immunoreactivity was observed in all membrane
fractions including nuclei (P1), mitochondria and synaptosomes (P2), and microsomes and
plasma membranes (P3). The P3 fraction contained the most HIP1 compared to other
membrane fractions. HIP1 could be removed from membranes by high salt (0.5 M NaCl)
buffers, indicating that it is not an integral membrane protein; however, since low salt
(0.1-0.25 M NaCl) was only able to partially remove HIP1 from membranes, its membrane
association is relatively strong. The extraction of P3 membranes with the non-ionic
detergent Triton X-100 revealed HIP1 to be a Triton X-100 insoluble protein. This
characteristic is shared by many cytoskeletal and cytoskeletal-associated membrane
proteins including actin, which was used as a control in this study. The biochemical
characteristics of HIP1 described were found to be identical in mouse and human brain and
were the same for both forms of the protein (both bands of the HIP1 doublet). HIP1
co-localized with huntingtin in the P2 and P3 membrane fractions, including the high-salt
membrane extractions, as well as in the Triton X-100 insoluble residue. The subcellular
distribution of HIP1 was unaffected by the expression of polyglutamine-expanded
huntingtin in transgenic mice and HD patient brain samples.

The localization of HIP1 protein was further investigated by immunohistochemistry in
normal adult mouse brain tissue. Immunoreactivity was seen in a patchy, reticular pattern in
the cytoplasm, appeared excluded from the nucleus and stained most intensely in a
discontinuous pattern at the membrane. These results are consistent with the association of
HIP1 with the cytoskeletal matrix and further indicate an enrichment of HIP1 at plasma
membranes. Immunoreactivity occurred in all regions of the brain, including cortex,
striatum, cerebellum and brainstem, but appeared most strongly in neurons and especially in
cortical neurons. As described previously, huntingtin immunoreactivity was seen exclusively
and uniformly in the cytosol.

The in situ hybridization studies showed HIP1 mRNA to be ubiquitously and
generally expressed throughout the brain. These data are consistent with the
immunohistochemical results and were identical to the distribution pattern of huntingtin
mRNA in transgenic mouse brains expressing full-length human huntingtin.

Protein Preparation And Western Blotting For Expression Studies 

Frozen human tissues were homogenized using a Polytron in a buffer containing
0.25 M sucrose, 20 mM Tris-HCl (pH 7.5), 10 mM EGTA, 2 mM EDTA supplemented with
10 µg/ml of leupeptin, soybean trypsin inhibitor and 1 mM PMSF, then centrifuged at
4,000 rpm for 10 minutes at 4°C to remove cellular debris. 100-150 µg/lane of protein was
separated on 8% SDS-PAGE mini-gels and then transferred to PVDF membranes. Huntingtin
and HIP1 were electroblotted overnight in Towbin's transfer buffer (25 mM Tris-HCl,
0.192 M glycine, pH 8.3, 10% methanol) at 30 V onto PVDF membranes (Immobilon-P,
Millipore) as described (Towbin et al., Proc. Nat'l Acad. Sci. (USA) 76: 4350-4354 (1979)).
Membranes were blocked for 1 hour at room temperature in 5% skim milk/TBS (10 mM
Tris-HCl, 0.15 M NaCl, pH 7.5). Antibodies against huntingtin (pAb BKP1, 1:500), actin
(mAb A-4700, Sigma, 1:500) or HIP1 (pAb HIP-pep1, 1:200) were added to blocking
solution for 1 hour at room temperature. After 3 x 10 minute washes in TBS-T (0.05%
Tween-20/TBS), secondary Ab (horseradish peroxidase conjugated IgG, Biorad) was applied
in blocking solution for 1 hour at room temperature. Membranes were washed and then
incubated in chemiluminescent ECL solution and visualized using Hyperfilm-ECL film
(Amersham).

Generation of Antibodies 
The generation of huntingtin specific antibodies GHM1 and BKP1 is described
elsewhere (Kalchman, et al., J. Biol. Chem. 271: 19385-19394 (1996)). The HIP1 peptide
(VLEKDDLMDMDASQQN, a.a. 76-91 of Seq. ID No. 2) was synthesized with Cys on the
N-terminus for the coupling, and coupled to Keyhole limpet hemocyanin (KLH) (Pierce)
with succinimidyl 4-(N-maleimidomethyl) cyclohexane-1-carboxylate (Pierce). Female
New Zealand White rabbits were injected with HIP1 peptide-KLH and Freund's adjuvant.
Antibodies against the HIP1 peptide were purified from rabbit sera using an affinity column
with low pH elution. The affinity column was made by incubation of HIP1 peptide with
activated thio-Sepharose (Pharmacia).

Western blotting of various peripheral and brain tissues was consistent with the RNA
data. The HIP1 protein levels observed were not equivalent in all tissues. The protein
expression is predominant in brain tissue, with highest amounts seen in the cortex and lower
levels seen in the cerebellum and caudate and putamen.

More region-specific analysis of HIP1 expression in the brain revealed no differential
expression pattern in affected individuals when compared to normal controls, with highest
levels of expression seen in both controls and HD patients in the cortical regions.

EXAMPLE 8 

CO-IMMUNOPRECIPITATION OF HIP1 WITH HUNTINGTIN
Confirmation of the HD-HIP1 interaction was performed using co-immunoprecipitation
as follows. Control human brain (frontal cortex) lysate was prepared in the same manner as
for the subcellular localization study. Prior to immunoprecipitation, tissue lysate was
centrifuged at 5000 rpm for 2 minutes at 4°C, then the supernatant was pre-cleared by
incubation with an excess amount of Protein A-Sepharose for 30 minutes at 4°C, and
centrifuged under the same conditions. Fifty microlitres of supernatant (500 mg protein) was
incubated with or without antibodies (10 µg of anti-huntingtin GHM1 (Kalchman, et al. 1996)
or anti-synaptobrevin antibody) in a total of 500 µl of incubation buffer (20 mM Tris-Cl
(pH 7.5), 40 mM NaCl, 1 mM MgCl2) for 1 hour at 4°C. Twenty microlitres of Protein
A-Sepharose (1:1 suspension, for GHM1 and no antibody control) or Protein G-Sepharose
(for anti-synaptobrevin antibody; Pharmacia) was added and incubated for 1 hour at 4°C.
The beads were washed with washing buffer (incubation buffer containing 0.5% Triton
X-100) three times. The samples on the beads were separated using SDS-PAGE (7.5%
acrylamide) and transferred to PVDF membrane (Immobilon-P, Millipore). The membrane
was cut at about 150 kDa after transfer for Western blotting (as described above). The upper
piece was probed with anti-huntingtin BKP1 (1/1000) and the lower piece with anti-HIP1
antibody (1/300).

The results showed that when an anti-HIP1 polyclonal antibody was immunoreacted
against a blot containing the GHM1 immunoprecipitates from the brain lysate a doublet was
observed at approximately 100 kDa. When GHM1 was immunoreacted against the same
immunoprecipitate the 350 kDa HD protein was also seen. The specificity of the HD-HIP1
interaction is shown by the absence of immunoreactive bands resulting from the proteins
adsorbing to the Protein A-Sepharose (Lysate + No Antibody) or when a random, non-related
antibody (Lysate + anti-Synaptobrevin) is used as the immunoprecipitating antibody.



EXAMPLE 9 

Subcellular fractionation of brain tissue 
Cortical tissue (20-100 mg/ml) was homogenized, on ice, in a 2 ml pyrex-teflon
IKA-RW15 homogenizer (Tekmar Company) in a buffer containing 0.303 M sucrose, 20 mM
Tris-HCl pH 6.9, 1 mM MgCl2, 0.5 mM EDTA, 1 mM PMSF, 1 mM leupeptin, soybean
trypsin inhibitor and 1 mM benzamidine (Wood et al., Human Molec. Genet. 5: 481-487
(1996)).

Crude membrane vesicles were isolated by two cycles of a three-step differential
centrifugation protocol in a Beckman TLA 120.2 rotor at 4°C based on the methods of Wood
et al. (1996). The first step precipitated cellular debris and nuclei from tissue homogenates
for 5 minutes at 1300 x g (P1). The 1300 x g supernatant was subsequently centrifuged for 20
minutes at 14 000 x g to isolate synaptosomes and mitochondria (P2). Finally, microsomal
and plasma membrane vesicles were collected by a 35 minute centrifugation at 142 000 x g
(P3). The remaining supernatant was defined as the cytosolic fraction.
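
The three centrifugation steps are specified as relative centrifugal force (x g), whereas
rotors are usually set in rpm. The sketch below records the steps from the text and
converts between the two with the standard relation RCF = 1.118 x 10^-5 x r x rpm^2
(r in cm). The radius used in the loop is a placeholder assumption, not a rotor
specification taken from the text; substitute the value from the rotor manual.

```python
import math

# (fraction, relative centrifugal force in x g, minutes), as listed in the text.
STEPS = [("P1", 1300, 5), ("P2", 14000, 20), ("P3", 142000, 35)]


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Invert RCF = 1.118e-5 * radius_cm * rpm**2."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force for a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


ASSUMED_RADIUS_CM = 3.9  # hypothetical radius for illustration only

for fraction, rcf, minutes in STEPS:
    rpm = rcf_to_rpm(rcf, ASSUMED_RADIUS_CM)
    print(f"{fraction}: {rcf} x g for {minutes} min is about {rpm:.0f} rpm")
```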

High salt extraction of membranes 
Aliquots of P3 membranes were twice suspended at 2 mg/ml in 0.5 M NaCl, 10 mM
Tris-HCl, 2 mM MgCl2, pH 7.2, containing protease inhibitors (see above). The same buffer
without NaCl was used as a control. The membrane suspensions were incubated on ice for 30
minutes and then centrifuged at 142 000 x g for 30 minutes.

Extraction of cytoskeletal and cytoskeletal-associated proteins

To extract cytoskeletal proteins, crude membrane vesicles from the P3 fraction
were suspended in a volume of Triton X-100 extraction buffer to give a protein:detergent
ratio of 5:1. The composition of the Triton X-100 extraction buffer was based on
the methods of Arai et al., J. Neuroscience 38: 348-357 (1994) and contained 2% Triton
X-100, 10 mM Tris-HCl, 2 mM MgCl2, 1 mM leupeptin, soybean trypsin inhibitor, PMSF and
benzamidine. Membrane pellets were suspended by hand with a round-bottom teflon pestle,
and placed on ice for 40 minutes. Insoluble cytoskeletal matrices were precipitated for 35
minutes at 142 000 x g in a Beckman TLA 120.2 rotor. The supernatant was defined as
non-cytoskeletal-associated membrane or membrane-associated protein and was removed.
The remaining pellet was extracted with Triton X-100 a second time using the same
conditions. We defined the final pellet as cytoskeletal and cytoskeletal-associated protein.
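
The buffer volume needed to reach the stated 5:1 protein:detergent ratio can be estimated
as in the sketch below. Two points are assumptions, since the text does not specify them:
that the ratio is by mass, and that the 2% Triton X-100 is weight per volume (so that 1%
equals 10 mg/ml). The function name and defaults are illustrative.

```python
def extraction_volume_ml(protein_mg: float,
                         triton_percent_wv: float = 2.0,
                         protein_to_detergent: float = 5.0) -> float:
    """Volume of extraction buffer giving the requested protein:detergent mass ratio.

    Assumes the Triton X-100 percentage is w/v, i.e. 1% corresponds to 10 mg/ml.
    """
    if protein_mg <= 0 or triton_percent_wv <= 0 or protein_to_detergent <= 0:
        raise ValueError("all inputs must be positive")
    detergent_mg = protein_mg / protein_to_detergent
    detergent_mg_per_ml = triton_percent_wv * 10.0
    return detergent_mg / detergent_mg_per_ml


# Hypothetical example: 1 mg of membrane protein.
print(extraction_volume_ml(1.0))
```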

Solubilization of protein and analysis by SDS-PAGE and Western Blotting

Membrane and cytoskeletal protein was solubilized in a minimum volume of 1%
SDS, 3 M urea, 0.1 mM dithiothreitol in TBS buffer and sonicated. Protein concentration was
determined using the BioRad DC Protein assay and samples were diluted at least 1 X with 5
X sample buffer (250 mM Tris-HCl pH 6.8, 10% SDS, 25% glycerol, 0.02% bromophenol
blue and 7% 2-mercaptoethanol) and were loaded on 7.5% SDS-PAGE gels (Bio-Rad
Mini-PROTEAN II Cell system) without boiling. Western blotting was performed as
described above.

Immunohistochemistry

Brain tissue was obtained from a normal C57BL/6 adult (6 months old) male mouse
sacrificed with chloroform then perfusion-fixed with 4% v/v paraformaldehyde/0.01 M
phosphate buffer (4% PFA). The brain tissues were removed, immersion fixed in 4% PFA
for 1 day, washed in 0.01 M phosphate buffered saline, pH 7.2 (PBS) for 2 days, and then
equilibrated in 25% w/v sucrose PBS for 1 week. The samples were then snap-frozen in
Tissue Tek molds by isopentane cooled in liquid nitrogen. After warming to -20°C, frozen
blocks derived from frontal cortex, caudate/putamen, cerebellum and brainstem were cut into
14 mm sections for immunohistochemistry. Following washing in PBS, the tissue sections
were blocked using 2.5% v/v normal goat serum for 1 hour at room temperature. Primary
antibodies diluted with PBS were applied to sections overnight at 4°C. Optimal dilutions for
the polyclonal antibodies BKP1 and HIP1 were 1:50. Using washes of 3 x 5 minutes in PBS
at room temperature, sections were sequentially incubated with biotinylated secondary
antibody and then an avidin-biotin complex reagent (Vecta Stain ABC Kit, Vector) for 60
minutes each at room temperature. Color was developed using 3,3'-diaminobenzidine
tetrahydrochloride and ammonium nickel sulfate.

For controls, sections were treated as described above except that HIP1 antibody
aliquots were preabsorbed with an excess of HIP1 peptide as well as a peptide unrelated to
HIP1 prior to incubation with the tissue sections.

In situ hybridization 

In situ hybridization was performed as previously described with some modification
(Suzuki et al., BBRC 219: 708-713 (1996)). The RNA probes were prepared using the
plasmid gt149 (Lin, B., et al., Human Molec. Genet. 2: 1541-1545 (1994)) or a 558 bp
subclone of HIP1. The anti-sense and sense single-stranded RNA probes were synthesized
using T3 and T7 RNA polymerases and the In Vitro Transcription Kit (Clontech) with the
addition of [α-35S]-CTP (Amersham) to the reaction mixture. Sense RNA probes were used
as negative controls. For HIP1 studies normal C57BL/6 mice were used. Huntingtin probes
were tested on two different transgenic mouse strains expressing full-length huntingtin,
cDNA HD10366 (44CAG) C57BL/6 mice and YAC HD10366 (18CAG) FVB/N mice. Frozen
brain sections (10 µm thick) were placed onto silane-coated slides under RNase-free
conditions. The hybridization solution contained 40% w/v formamide, 0.02 M Tris-HCl
(pH 8.0), 0.005 M EDTA, 0.3 M NaCl, 0.01 M sodium phosphate (pH 7.0), 1x Denhardt's
solution, 10% w/v dextran sulfate (pH 7.0), 0.2% w/v sarcosyl, yeast tRNA (500 mg/ml) and
salmon sperm DNA (200 mg/ml). The radiolabelled RNA probe was added to the
hybridization solution to give 1 x 10^6 cpm/200 µl/section. Sections were covered with
hybridization solution and incubated on formamide paper at 65°C for 18 hours. After
hybridization, the slides were washed for 30 minutes sequentially with 2x SSC, 1x SSC and
high stringency wash solution (50% formamide, 2x SSC and 0.1 M dithiothreitol) at 65°C,
followed by treatment with RNase A (1 mg/ml) at 37°C for 30 minutes, then washed again
and air-dried. The slides were first exposed on autoradiographic film (β-max, Amersham,
UK) for 48 hours and developed for 4 minutes in Kodak D-19 followed by a 5 minute
fixation in Fuji-fix. For longer exposures, the slides were dipped in autoradiographic
emulsion (50% w/v in distilled water, NR-2, Konica, Japan), air-dried and exposed for 20
days at 4°C then developed as described. Sections were counterstained with methyl green
or Giemsa solutions.

EXAMPLE 10 

We determined a more precise location of the HIP1 gene on chromosome 7 in the
context of a physical and genetic map of chromosome 7, and determined its genomic
organization. HIP1 maps by FISH and RH mapping to chromosome band 7q11.23, which
contains the chromosomal region commonly deleted in Williams-Beuren syndrome (WS).
We used several methods to refine the mapping of HIP1 in this region. PCR screening of a
chromosome 7 YAC library (Scherer et al., Mammalian Genome 3: 179-181 (1992)) with
primers from the 3' UTR of HIP1 resulted in the identification of only a single positive YAC
clone (HSC7E512). This YAC clone had previously been shown to map near the Williams
syndrome commonly deleted region (Osborne et al., Genomics 45: 402-406 (1997)). The
HIP1 cDNA was then used to screen a chromosome 7 specific cosmid library from the
Lawrence Livermore National Laboratory (LL07NC01), and the RPCI genomic P1-derived
artificial chromosome (PAC) library (Pieter de Jong, Roswell Park, Buffalo, NY). Several
PAC and cosmid clones that were already part of pre-assembled contigs in the Williams
syndrome region at 7q11.23 were identified (Fig. 5). Restriction enzyme digestion, blot
hybridization experiments and PCR screening confirmed that the clones contained the HIP1
gene.

We determined the exon-intron boundaries and intron sizes of HIP1. Primers were
designed based on the sequence of the HIP1 transcript and used to sequence directly from the
cosmid, PAC clone and long PCR products from PAC or genomic DNA. Whenever a PCR
fragment generated was longer than predicted from the cDNA sequence, it was assumed to
contain an intron. The size of the introns was determined by sequencing the intron directly
or by PCR amplification of the introns from both genomic DNA and the cosmid or PAC
clone from the region. Three sets of overlapping cosmids and a PAC clone that contain the
entire coding sequence of HIP1 were characterized (Fig. 5). Cosmids 181G10 and 250F2
were digested with EcoRI and cloned into the plasmid Bluescript. Further sequences were
generated from these plasmid subclones. Intron-exon boundary sequences were then
identified by comparing HIP1 genomic and transcript sequence. The gene is contained within
75 kb and comprises 29 exons and 28 introns. The intron-exon boundary sequences are
shown in Table 4, along with the exon and intron sizes. A graphic summary of these data is
also shown in Fig. 5. Exons 1 to 28 contained the coding regions. The last and largest
exon of the HIP1 gene was found to contain approximately 7 kb. Most of the intron-exon
junctions followed the canonical GT-AG rule. An AT was found at the 3' splice site of exon
1 and an AC at the 5' splice site of exon 2. Sequence data from all the exon-intron borders
of the coding region and 3'-UTR are set forth in Seq. ID Nos. 16-44. (These sequences have
been deposited with GenBank as Accession Nos. AF052261 to AF052288.)
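
The boundary classification described above (canonical GT-AG introns, plus the one intron
that begins with AT and ends with AC) can be expressed as a small check on the first and
last two bases of an intron. The sketch below is illustrative only: the function name is
an assumption and the toy sequences are invented, not taken from Table 4 or the sequence
listing.

```python
def splice_class(intron: str) -> str:
    """Classify an intron by its first two (donor) and last two (acceptor) bases."""
    s = intron.upper()
    if len(s) < 4:
        raise ValueError("intron sequence too short to classify")
    donor, acceptor = s[:2], s[-2:]
    if (donor, acceptor) == ("GT", "AG"):
        return "GT-AG"
    if (donor, acceptor) == ("AT", "AC"):
        return "AT-AC"
    return f"other ({donor}-{acceptor})"


# Invented toy introns for illustration only.
print(splice_class("GTAAGTCCTTTTCAG"))
print(splice_class("ATATCCTTTTCCAAC"))
```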

Sequence analysis of the previously published 5' untranslated region (GenBank accession
U79734) revealed the possibility that the open reading frame extends upstream of the ATG in
exon 4 to a 5' ATG in exon 1. Although we failed to obtain any additional 5' sequences
despite repeated 5' RACE analyses, an additional ATG, 284 bp upstream of the previously
published exon 1, is in the same reading frame and has the surrounding sequence
TGCCATGTT, which is similar to AGCCATGGG, the consensus Kozak sequence
(Kozak, M. Nucl. Acids Res. 15: 8125-8148 (1987)). If translated from this ATG, the protein
would be highly homologous to the N-terminal portion of ZK370.3 and yeast Sla2 protein
(Fig. 6). The translated protein in the region of exons 1 to 3 shows an identity of >40% and
similarity of >60% to the N-terminal part of ZK370.3. This suggests that exons 1 to 3
are probably translated.
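
The similarity between the sequence surrounding the upstream ATG and the Kozak consensus
quoted above can be made explicit with a position-by-position comparison. This is a
minimal sketch using only the two nine-base strings given in the text; the variable names
are illustrative.

```python
candidate = "TGCCATGTT"  # sequence surrounding the upstream ATG, from the text
consensus = "AGCCATGGG"  # Kozak consensus as quoted in the text

# True where the candidate carries the consensus base at that position.
matches = [a == b for a, b in zip(candidate, consensus)]
n_match = sum(matches)
match_line = "".join("|" if m else "." for m in matches)

print(candidate)
print(match_line)
print(consensus)
print(f"{n_match} of {len(consensus)} positions match")
```

The central CCATG core is shared in full; the mismatches fall at the first position and
at the two positions after the ATG.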

In western blot studies, HIP1 is identified as a 120 kd protein (11, 23), while the
putative translation of the previously published cDNA gives a protein product with an
estimated molecular weight of approximately 100 kd. If the HIP1 gene were translated from
the ATG 284 bp upstream of exon 1, the expected product would have an estimated
molecular weight of 122 kd. RNA PCR studies with primers downstream of this ATG and
primers in exon 7 amplify expected products of 576 and 600 bp. Taken together these data
support the contention that exon 1 extends further 5' and that the HIP1 gene is translated
from the ATG in exon 1. Sequence analyses showed no TATA, CAAT box or any GC rich
promoter sequence upstream of the exon 1 ATG. The promoter prediction programs provided
by the server http://dot.imgen.bcm.tmc.edu:9331/seq.search/gene.search.html did not predict
any promoter upstream of the ATG at position -284 (position 0 corresponds to the first
nucleotide of the published cDNA, GenBank accession U79734). This suggests that HIP1
may have additional exons.

Finally, we evaluated the HIP1 gene as a candidate gene for Huntington disease in
families without CAG expansion. In a large study of 1022 patients with a clinical diagnosis
of HD, no CAG repeat expansion was found in 12 patients who might represent phenocopies
of HD. In at least three families, linkage studies have excluded the HD locus at 4p.
Mutation in an interacting protein could result in a similar phenotype, as illustrated by the
discovery of mutations in dystrophin associated proteins in muscular dystrophies. A
mutation in HIP1 may result in altered interaction of huntingtin and HIP1 and lead to cellular
toxicity as a result of more HIP1 being free in the cytosol. Thus mutations in huntingtin
interacting protein genes may cause a phenotype suggestive of HD. We studied two of the
larger families diagnosed with HD without CAG expansion in the HD gene, with the highly
informative marker D7S1816, which maps centromeric and very close to the HIP1 gene. The
clinical findings in both families were compatible with a diagnosis of HD, although there
were atypical features. In family 1733, the HIP1 locus appears to be excluded, as there are
two recombinants with the marker. Individuals II-5 and II-7, who do not share the haplotype
with the affected individuals, are now 41 and 39 years old and have normal neurological
examinations.

In family 1602, a lod score of 1.92 is obtained with the marker D7S1816 at θmax = 0.
Sequencing of all the coding exons did not reveal any mutation in any exon sequence. The
promoter sequence has not been examined. Subsequently a whole genome scan revealed
higher lod scores for markers on chromosome 20p.
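
To put the reported lod score in perspective, a lod score is the base-10 logarithm of the
ratio between the likelihood of the data at a given recombination fraction and the
likelihood under no linkage (recombination fraction 0.5). The sketch below converts
between the two forms; the function names are assumptions, and the likelihood values in
the test are arbitrary illustrations rather than data from the families studied.

```python
import math


def lod_score(likelihood_linked: float, likelihood_unlinked: float) -> float:
    """log10 of the likelihood ratio (linkage at theta versus theta = 0.5)."""
    return math.log10(likelihood_linked / likelihood_unlinked)


def odds_from_lod(lod: float) -> float:
    """Likelihood ratio in favour of linkage implied by a lod score."""
    return 10.0 ** lod


# The lod score reported in the text for family 1602.
reported_odds = odds_from_lod(1.92)
print(f"lod 1.92 corresponds to odds of about {reported_odds:.0f}:1")
```

A lod score of 3 (odds of 1000:1) is the conventional threshold for declaring linkage, so
the value reported here is suggestive rather than conclusive.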

EXAMPLE 11 

A mouse brain lambda ZAPII cDNA library (Stratagene # 93609) was screened with
various mouse ESTs which showed homology to the human HIP1 cDNA sequence (see Fig.
7). The ESTs were initially isolated from the non-redundant Database of GenBank EST
Division by performing a BLASTN using a fragment of the human HIP1 cDNA as the query.
We obtained 4 different ESTs which showed homology to HIP1: 1) aa110840 (clone 520282),
which is 399 bp and shows 58% identity, at the nucleotide level, to position 1880 to 2259 of
the HIP1 cDNA; 2) w82687 (clone 404331), which is 420 bp and shows 66% identity, at the
nucleotide level, to position 2750 to 2915 of the HIP1 cDNA; 3) aa138903 (clone 586510),
which is 509 bp and shows 88% identity, at the nucleotide level, to position 2763 to 2832 of
the HIP1 cDNA; and 4) aa388714 (clone 569088), which is 404 bp and shows 88% identity,
at the nucleotide level, to position 2475 to 2692 of the HIP1 cDNA.

mHIP1:

Fifty nanograms of a 362 bp KpnI and PvuII fragment of clone 569088 (containing EST
aa388714) was radioactively labeled with [32-P]-dCTP using random-priming. The probe
was allowed to hybridize to filters containing > 2 x 10^5 pfu/ml of the mouse brain lambda
ZAPII cDNA library (Stratagene # 93609) overnight at 65°C in Church buffer (0.5 M sodium
phosphate buffer (pH 7.2), 2.7% SDS, 1 mM EDTA). The filters were washed at room
temperature for 15 minutes with 2X SSPE, 0.1% SDS, then at 65°C for 20 minutes with
1X SSPE, 0.1% SDS and finally twice at 65°C with 0.5X SSPE, 0.1% SDS. The filters were
exposed to X-ray film (Kodak, XAR5) overnight at -70°C. Primary positives were isolated,
replated and subsequent secondary positives were hybridized and washed as for the primary
screen. The resulting positive phage was converted into plasmid DNA by conventional
methods (Stratagene) and the cDNA, termed 4n-nl, was isolated and sequenced 551 bp and
541 bp from the T7 and T3 end, respectively. 4n-nl is 2.2 kb in length and the T7 end showed
72% identity, at the nucleotide level, to position 1486 to 1715 of the HIP1 cDNA. The 2.2 kb
insert from 4n-nl was excised using EcoRI. Fifty nanograms of the 2.2 kb insert was used to
produce a radioactive probe, which was used to screen the mouse brain lambda ZAPII cDNA
library (Stratagene # 93609) in the same manner as above. The resulting positive phage was
converted into plasmid DNA by conventional methods (Stratagene) and the cDNA, termed
mHIP1, was isolated and completely sequenced. mHIP1 is 2.3 kb in length and showed 85%
identity, at the nucleotide level, to position 726 to 3072 of the HIP1 cDNA.

mHIP1a:

Fifty nanograms of a 1.3 kb EcoRI and NcoI fragment of clone 404331 (containing EST
w82687) was radioactively labeled with [32-P]-dCTP using random-priming. The probe was
allowed to hybridize to filters containing > 2 x 10^5 pfu/ml of the mouse brain lambda ZAPII
cDNA library (Stratagene # 93609) overnight at 65°C in Church buffer (see above). The
filters were washed at room temperature for 15 minutes with 2X SSPE, 0.1% SDS, then at
65°C for 20 minutes with 1X SSPE, 0.1% SDS and finally twice at 65°C with 0.2X SSPE,
0.1% SDS. The filters were exposed to X-ray film (Kodak, XAR5) overnight at -70°C.
Primary positives were isolated, replated and subsequent secondary positives were
hybridized and washed as for the primary screen. The resulting positive phage was converted
into plasmid DNA by conventional methods (Stratagene) and the cDNA, termed mHIP1a, was
isolated and completely sequenced. mHIP1a is 3.96 kb in length and shows 60% identity, at
the nucleotide level, to position 12 to 2703 of the HIP1 cDNA.

EXAMPLE 12 

HIP1a:

The entire mHIP1a cDNA sequence was used to screen the non-redundant Database
of GenBank EST Division. We identified a human EST, T08283, which showed homology to
mHIP1a. T08283 (clone HBBB80) is 391 bp and shows 87% identity, at the nucleotide level,
to position 2904 to 3113 of the mHIP1a cDNA.

Fifty nanograms of a 1.6 kb HindIII and NotI fragment of clone 404331 (containing
EST T08283) was radioactively labeled with [32-P]-dCTP using random-priming. The probe
was allowed to hybridize to filters containing > 2 x 10^5 pfu/ml of a human frontal cortex
lambda cDNA library overnight at 65°C in Church buffer (see above). The filters were
washed at 65°C for 10 minutes with 1X SSPE, 0.1% SDS, and then for 30 minutes and 15
minutes with 0.1X SSPE, 0.1% SDS. The filters were exposed to X-ray film (Kodak, XAR5)
overnight at -70°C. Primary positives were isolated, replated and subsequent secondary
positives were hybridized and washed as for the primary screen. The resulting positive phage
was converted into plasmid DNA by conventional methods (Stratagene) and the cDNA,
termed HIP1a, was isolated and completely sequenced. HIP1a is 3.2 kb in length and shows
53% identity, at the nucleotide level, to position 876 to 3058 of the HIP1 cDNA.

EXAMPLE 13

Following the identification of a 1.2 kb partial human HIP-1 cDNA by yeast
two-hybrid interaction studies, a 3.9 kb HIP-1 fragment was isolated from a cDNA library,
ligated to a 5' RACE product, then subcloned into the mammalian expression vector pCI-neo
(Promega). This construct, CMV-HIP-1, expresses HIP-1 from the CMV promoter and was
used in the cell expression studies described below. Mouse HIP-1a (mHIP-1a) was also
subcloned into a CMV driven expression vector for cell culture expression studies.

EXAMPLE 14 

Huntingtin proteins with expanded polyglutamine tracts can aggregate into large,
irregularly shaped deposits in HD brains, transgenic mice and in vitro cell culture. We have
shown that in HEK (human embryonic kidney) 293T cells the aggregation of full-length and
larger huntingtin fragments occurs after the cells have been exposed to a period of apoptotic
stress. In order to assess the consequence of HIP-1 expression in cultured cells, we used
huntingtin aggregation as one marker of viability.
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Human embryonic kidney cells (HEK 293T) were grown on glass coverslips in 
Dulbecco's modified Eagle medium (DMEM, Gibco, NY) with 10% fetal bovine serum and 
antibiotics, in 5% C02 at 37°C. The cells were transfected at 30% confluency with the 
calcium phosphate protocol by mixing Qiagen-prepared DNA (Qiagen, CA) with 2.5 M 
5 CaClz, then incubating at room temperature for 1 0 min. 2X HEPES buffer (240 mM NaCl, 
3.0 mM Na2HP04, 100 mM HEPES, pH 7.05) was added to the DNA/calcium mixture, 
incubated at 37°C for 60 sec, then added to the cells. After 12-18 h, the media was removed, 
the cells were washed and fresh media was added. At 36 h post-transfection, the cells were 
exposed to an apoptotic stress by treatment with 35 uM tamoxifen (Sigma) for 1 hour, or left 

10 untreated, then processed for immunofluorescence. The cells were washed with PBS, fixed in 
4% paraformaldehyde/PBS solution for 20 minutes at room temperature then permeabilized 
in 0.5% Triton X-IOO/PBS for 5 min. Following three PBS washes, the cells were incubated 
with anti-huntingtin antibody MAB2166 (Chemicon) (1 :2500 dilution) and anti-HIP-1 
antibody HIP- Ifp (1:100 dilution) in 0.4% BSA/PBS for 1 h at room temperature in a 

15 humidified container. The primary antibody was removed, the cells were washed and 

secondary antibodies conjugated to Texas red or FITC were added at a 1 :600-l :800 dilution 
for 30 min at room temperature. The cells were then washed again, and the coverslips were 
mounted onto slides with DAPI (4',6'-diamindino-2 phenyhndole, Sigma) as a nuclear 
counter-stain. Immunofluorescence was viewed using a Zeiss (Axioscope) microscope, 

20 digitally captured with a CCD camera (Princeton Instrument Inc.) and the images were 
colourized and overlapped using the Eclipse (Empix Imaging Inc.) software program. 
Appropriate control experiments were performed to determine the specificity of the 
antibodies, including secondary antibody only and mock transfected cells. 
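
As a worked illustration of the 2X HEPES buffer recipe given in this example (240 mM NaCl, 3.0 mM Na2HPO4, 100 mM HEPES), the following sketch converts the stated concentrations into masses for a chosen batch volume. It is not part of the original protocol: the molar masses assume anhydrous NaCl and Na2HPO4 and HEPES free acid, the 100 ml batch size is arbitrary, and the function name `grams_needed` is hypothetical. The pH adjustment to 7.05 is not modeled.

```python
# Molar masses in g/mol. ASSUMPTION: anhydrous salts and HEPES free acid;
# the source text does not state which forms were used.
MOLAR_MASS = {"NaCl": 58.44, "Na2HPO4": 141.96, "HEPES": 238.30}

# Final concentrations of the 2X HEPES buffer in mM, as stated in the text.
RECIPE_MM = {"NaCl": 240.0, "Na2HPO4": 3.0, "HEPES": 100.0}


def grams_needed(recipe_mm, volume_ml):
    """Return the grams of each component needed for volume_ml of buffer."""
    litres = volume_ml / 1000.0
    return {
        name: (conc_mm / 1000.0) * litres * MOLAR_MASS[name]
        for name, conc_mm in recipe_mm.items()
    }


# ASSUMPTION: a 100 ml batch, chosen only for illustration.
masses = grams_needed(RECIPE_MM, 100.0)
for name, grams in masses.items():
    print(f"{name}: {grams:.3f} g")
```

If a hydrated phosphate salt or the HEPES sodium salt is used instead, only the entries in `MOLAR_MASS` need to change.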

The huntingtin fragment HD1955 was used in the aggregation studies. This fragment 
represents the N-terminal 548 amino acids of huntingtin, and corresponds approximately to 
the polyglutamine-containing fragment produced by caspase 3 cleavage of huntingtin. 
Transfection of HD1955 with 15 polyglutamines (HD1955-15) results in a diffuse 
cytoplasmic distribution of the expressed protein. Transfection of HD1955 with 128 
polyglutamines (HD1955-128) also results in diffuse cytoplasmic expression. However, 
exposure of cells transfected with HD1955-128 to tamoxifen results in a marked 
redistribution of huntingtin. In 29% of cells expressing HD1955-128, the huntingtin protein 
appears as dense aggregates that are localized in the perinuclear area of the cell. In contrast, 
less than 1% of HD1955-128 expressing cells contain aggregates in the absence of tamoxifen, 
and 0% of HD1955-15 cells form aggregates in the presence or absence of tamoxifen 
treatment. 

Co-transfection of HIP-1 and HD1955 was used to test the influence of HIP-1 on 
huntingtin aggregation. As a control, β-galactosidase was co-transfected with HD1955. In the 
control transfections, 1-2% of cells expressing HD1955-128 formed aggregates in the absence 
of tamoxifen, similar to HD1955-128 expressed alone. However, when HD1955-128 was 
co-expressed with HIP-1, an average of 14% of huntingtin-expressing cells contained 
aggregates with no tamoxifen treatment. Double-labeling demonstrated that the majority of 
the cells containing aggregates also expressed HIP-1, directly implicating HIP-1 in the 
increase in aggregation. Therefore, these results indicate that HIP-1 provides sufficient stress 
on the huntingtin-expressing cells to form aggregates, to the extent that tamoxifen is no 
longer necessary. 

EXAMPLE 15 

We next designed a series of experiments to identify a region of HIP-1 sufficient for 
inducing aggregate formation of HD1955-128. As described above, HIP-1 contains a domain 
with high homology to the death effector domains (DED) present in many apoptosis-related 
proteins. The DED domain of HIP-1 was ligated in-frame to HD1955-128, 3' from the 
caspase-3 cleavage site. Transfection of the resulting fusion protein with the DED ligated in 
the sense orientation (HD1955-128-DEDsense) resulted in a large number (30-50%) of cells 
containing aggregates, without tamoxifen incubation. In contrast, expression of a 
huntingtin-DED fusion protein with DED in the antisense orientation 
(HD1955-128-DEDantisense) did not produce more aggregates than the HD1955-128 no 
tamoxifen control. Therefore, the DED domain of HIP-1 is sufficient to stress the cells, 
causing aggregate formation. 

EXAMPLE 16 

To directly assess the effect of HIP-1 expression on cell viability, mitochondrial 
function tests were performed on 293T cells transfected with HIP-1. The assessment of 
mitochondrial function, using the MTT assay (Carmichael et al., Cancer Res. 47: 936-942 
(1987); Vistica et al., Cancer Res. 51: 2515-2520 (1991)), is a standard method to measure 
cell viability. The MTT assay quantitates the formation of a coloured substrate (formazan), 
with the mitochondria of viable cells forming more substrate than non-viable cells. Since 
decreased mitochondrial activity is an early consequence of many cellular toxins, the MTT 
assay provides an early indicator of cell damage. 

For cell viability assays, HEK 293T cells were seeded at a density of 5 x 10^4 cells into 
96-well plates and transfected with 0.1 µg or 0.08 µg HIP-1 or 0.1 µg of the control construct 
lacZ using the calcium phosphate method described above. At 24-36 hours post-transfection, 
tamoxifen-treated cells were incubated for 2 hours in a 1:10 dilution of WST-1 reagent 
(Boehringer Mannheim) and release of formazan from mitochondria was quantified at 450 
nm using an ELISA plate reader (Dynatech Laboratories) (Carmichael et al., 1987; Mosmann, 
J. Immunol. Meth. 65: 55-63 (1983)). One-way ANOVA and Newman-Keuls test were used 
for statistical analysis. The transfection efficiency, measured by β-galactosidase staining and 
immunofluorescence, was approximately 50%. 
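
For readers unfamiliar with the statistical step named above, the following is a minimal sketch of how a one-way ANOVA F statistic is computed across treatment groups. It is an illustration only and does not reproduce the analysis actually performed: the absorbance values are invented placeholders, the group names and the function name `one_way_anova_f` are hypothetical, and the Newman-Keuls post-hoc comparison is not implemented.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of readings."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual readings around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# PLACEHOLDER DATA: invented A450 readings, n=4 per group, for illustration
# only. These are not values from the experiments described in the text.
control = [1.02, 0.98, 1.01, 0.99]
treatment_a = [0.80, 0.78, 0.83, 0.79]
treatment_b = [0.85, 0.88, 0.84, 0.87]

f_stat, df_b, df_w = one_way_anova_f([control, treatment_a, treatment_b])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic would then be compared against the F distribution with the reported degrees of freedom to obtain a p-value; a statistics package is the usual way to do that step.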

We have previously demonstrated that expression of mutant huntingtin results in 
increased susceptibility to an apoptotic stress induced by sub-lethal doses of tamoxifen in 
transfected 293T cells (Martindale et al., 1998). A similar assay was used to test the 
consequence of HIP-1 expression. With 0.1 µg transfected HIP-1 DNA, after 24 hr 
expression, HIP-1 resulted in increased cell death in response to tamoxifen, compared with 
the tamoxifen-treated β-galactosidase control (p<0.01, n=4). Reducing the amount of 
transfected HIP-1 DNA to 0.08 µg also resulted in increased cell death compared with control 
(p<0.01, n=4), indicating the high potency of HIP-1 (Fig. 8). Furthermore, increased cell 
death in cells transfected with HIP-1 was observed in the absence of apoptotic stress at 48 hrs 
post-transfection, but was so severe that it could not be accurately quantitated. Thus, an 
earlier time point (24 hr) had to be used for better reproducibility, using an apoptotic stress to 
unmask the phenotype. 

In order to model a pathogenic interaction of HIP-1 and huntingtin in the HEK 293 
mammalian cell system, HIP-1 was transfected into cell lines stably expressing huntingtin. 
Two cell lines were chosen for the initial studies: one line expressed the truncated HD1955 
construct with 15 glutamines, and the second expressed HD1955 with 128 repeats. 
Western blotting indicated that the cell lines expressed huntingtin at similar levels. To assess 
whether HIP-1 is toxic in the presence of mutant huntingtin, 0.1 µg HIP-1 DNA was 
transfected into the HD1955-128 cell line. Transfection of HIP-1 into the HD1955-15 cell 
line was used as the wild-type huntingtin control, and transfection of LacZ into both cell lines 
was the control for transfection-related toxicity (Figs 9A and 9B). MTT toxicity assays 
showed that HIP-1 in the presence of mutant huntingtin (HD1955-128) was significantly 
more toxic than HIP-1 with wild-type huntingtin (HD1955-15), p<0.001, n=4 (Fig. 9C). This 
toxicity was observed at 24 hr and 36 hr post-transfection. No tamoxifen was needed to 
unmask the phenotype, suggesting that the combined cell stress of HIP-1 with truncated 
huntingtin was sufficient to reduce cell viability relative to control. 

CLAIMS 



1. A polypeptide comprising the sequence given by Seq. ID No. 5. 

2. A cDNA molecule comprising the sequence given by Seq. ID No. 6. 

3. A polypeptide comprising the sequence given by Seq. ID No. 7. 

4. A method for ameliorating the effects of Huntington's disease in a 
patient expressing a HIP-apoptosis modulating protein, comprising the step of administering 
to the patient a therapeutic composition which reduces the activity of the HIP-apoptosis 
modulating protein. 

5. A method according to claim 4, wherein the composition comprises a 
material which binds to HIP-apoptosis modulating protein. 

6. The method according to claim 4, wherein the composition comprises 
an expression vector encoding huntingtin having a normal number of repeats. 

7. An expression vector for expression of a gene in a mammalian host 
comprising a region encoding an HD-interacting polypeptide. 

8. The expression vector according to claim 7, wherein the HD-interacting 
polypeptide is an HIP-apoptosis modulating protein. 

9. The expression vector according to claim 8, wherein the HIP-apoptosis 
modulating protein has a sequence which includes the amino acid sequences given by SEQ 
ID Nos. 2, 4, 5 or 7. 

10. The expression vector of claim 7, wherein the HD-interacting 
polypeptide interacts differently with expanded Huntingtin than with Huntingtin having a 
CAG repeat region containing 15 to 35 repeats. 

11. The expression vector according to any of claims 7-10, further 
comprising a region encoding Huntingtin having a polyglutamine tract of 35 or fewer. 

12. A method for inducing apoptotic death in cells, comprising the step of 
introducing into the cells an expression vector encoding at least the death effector domain of 
a HIP-apoptosis modulating protein whereby the death effector domain is expressed by the 
cells. 

13. The method of claim 12, wherein the expression vector encodes the 
amino acid sequence given by Seq. ID No. 2. 

14. The method of claim 12, wherein the expression vector encodes the 
amino acid sequence given by Seq. ID No. 4. 

15. A method for screening a composition for the ability to inhibit 
apoptosis induced by an HIP-apoptosis modulating protein, comprising simultaneously 
exposing a population of cells to the composition and an HIP-apoptosis modulating protein 
and measuring the extent of cell death. 

>Usurpin A 

SAEVIHQVEEALDTDEKKMLLFLCRDVAIDWPPNVROLLDILRERGKLSVCDLAELLYRVHRFDLLKRILK 
>Usurpin B 

VRVLMAHXGEDLDKSDVSSLIFLMKDYMGRGKISXHKSFLDLVVELHKLNLVAPDQLDLLEKCLKNIHRI^^ 
>Casp-8 A 

FSRNLyDIGELQDSEDLASLKELSLDyiPQRKOEPIKDALMirQRLOEKRl'ILEESNLSFLKELLFRINRLDLLITyLN 
>Casp-8 B 

YRVMLYQISEEVSREELRSFKFLLQHEISKCKlDDDMNLLDiriEMEtCRVILGEGKLDILKRVCAQINfCSLLKIND 
>Casp-10 A 

FRHfaLTIDSNLGVQDVENLKrLCIGLVPNKKLEKSSSASDVFEIILLAflDLLSEEDPFFLAELIiYIIRQKKLLQHLNC 
>Casp-10 B 

FRNLLyELBEGIDSSNLKDMIFLLKDSLPKTEMTSLSFLAFLEKQGiaDEDNLTCLEDLCKTWPKLLRNIEK 
>FADD 

FLVLLHSVSSSLSSSELTELKFLCLGRVGKRKLERVQSGLDLFSMLLEQNDLEPGHTELLRELLASLRRIIDLLRRVDD 
>MC159 A 

SLPFLRHLLEELDSKEDSLLLFLGHDAAPGCTTVTQALCSLSQQRKLTLAALVEMLYVLQP^IDLLKSRFG 
>MC159 B 

yHKLMVCVGEELDSSELRALRLFACNLNPSLSTALSESSRPVELVLALENVGLVSPSSVSVLADMtRTLRRLDLCQQLVE 
>E8 

FRCLMALVNDFLSDKEVEEHYFLCAPRLESHLEPGSKKSrLHLASIiLEDLELLGGDKLTFLRKLLTTIGRADLVKNL^^ 
>KS orfK13 A 

TYEVLCEVARKLGTDDREVVLFLLNVrLPQP'TLAQLIGALRALKEEGRLTFPLLAECLPRAGRRDLLKDLLh 
>KS orfK13 B 

yQLTVLHVDGELCARDrRSLrFLSKDTrGSRSTPQTFLHNVyCME^ILDLLGPTDVDALMS^^LRSI;SRVDrX3HQVQT 
>HIP1 

SELEADLAEQQHLRQQAADDCEFLRAELDELRRQREDTEKAQRSLSEIERKAQANEQRYSKLKEKYSELVQNHADLLRKN 
AE 

>HIP1a 

GELEEQRKQKQKALVDNEQLRHELAQLRAAQLERERSOGLREEAERKASATEARYNKLKEKHSELVHVIIAELLRKNAD 
>mHIP1a 

NGLEAELEEQRKQKQKALVDNEQLRHEUVQLKALOLEGARNQGLREEAERKASATEARYSKLKEKHSELINTHAELLRPCN 
AD 

>mHIPl 

SELEAELA'EQQHLGRQAMDDClCFLRTELDELKROREDTEI<;i.QRSLTEI£RKAQANEQRYSKLKEKySELVQNHADLLRKN 
AE 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Kalchman, Michael 
Hayden, Michael R. 

Hackam, Abigail 
Chopra, Vikramjit Singh 
Nicholson, Donald W. 
Vaillancourt, John P. 
Rasper, Dita M. 

(ii) TITLE OF INVENTION: Apoptosis Modulators That Interact with the 
Huntington's Disease Gene 

(iii) NUMBER OF SEQUENCES: 44 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Oppedahl & Larson 

(B) STREET: PO Box 5270 

(C) CITY: Frisco 

(D) STATE: CO 

(E) COUNTRY: USA 

(F) ZIP: 80443-5270 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.44 Mb storage 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: MS DOS 5.0 

(D) SOFTWARE: WordPerfect 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Larson, Marina T. 

(B) REGISTRATION NUMBER: 32038 

(C) REFERENCE/DOCKET NUMBER: UBC.P-013US2 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (970) 668-2050 

(B) TELEFAX: (970) 668-2052 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1164 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC 50 
AAAGTTGAAA GATCTGTTCT ACCGCTCCAG CAACCTGCAG TACTTCAAGC 100 
GGGTCATTCA GATCCCCCAG CTGCCTGAGA ACCCACCCAA CTTCCTGCGA 150 
GCCTCAGCCC TGTCAGAACA TATCAGCCCT GTGGTGGTGA TCCCTGCAGA 200 
GGCCTCATCC CCCGACAGCG AGCCAGTCCT AGAGAAGGAT GACCTCATGG 250 
ACATGGATGC CTCTCAGCAG AATTTATTTG ACAACAAGTT TGATGACNTC 300 
TTTGGCAGTT CATCCAGCAG TGATCCCTTC AATTTCAACA GTCAAAATGG 350 
TGTGAACAAG GATGAGAAGG ACCACTTAAT TGAGCGACTA TACAGAGAGA 400 
TCAGTGGATT GAAGGCACAG CTAGAAAACA TGAAGACTGA GAGCCAGCGG 450 
GTTGTGCTGC AGCTGAAGGG CCACGTCAGC GAGCTGGAAG CAGATCTGGC 500 
CGAGCAGCAG CACCTGCGGC AGCAGGCGGC CGACGACTGT GAATTCCTGC 550 
GGGCAGAACT GGACGAGCTC AGGNGGCAGC GGGAGGACAC CGAGAAGGCT 600 
CAGCGGAGCC TGTCTGAGAT AGAAAGGAAA GCTCAAGCCA ATGAACAGCG 650 
ATATAGCAAG CTAAAGGAGA AGTACAGCGA GCTGGTTCAG AACCACGCTG 700 
ACCTGCTGCG GAAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA 750 
CAAGCCCAGG TAGATTTGGA ACGAGAGAAA AAAGAGCTGG AGGATTCGTT 800 
GGAGCGCATC AGTGACCAGG GCCAGCGGAA GACTCAAGAA GAGCTGGAAG 850 
TTCTAGAGAG CTTGAAGCAG GAACTTGGCA CAAGCCAACG GGAGCTTCAG 900 
GTTCTGCAAG GCAGCCTGGA AACTTCTGCC CAGTCAGAAG CAAACTGGGC 950 
AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG 1000 
CAGCTCATAG GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC 1050 
ACTCAGCTCA AACTGGCCAG CACAGAGGAA TCTATGTGCC AGCTTGCCAA 1100 
AGACCAACGA AAAATGCTTC TGGTGGGGTC CAGGAAGGCT GCGGAGCAGG 1150 
TGATACAAGA CGCG 1164 
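
The numbered layout used for this sequence (blocks of ten bases, with a running position count at the end of each row) lends itself to a mechanical consistency check. The following is a minimal sketch and not part of the original filing; the function name `parse_listing_rows` and the decision to accept `N` as an ambiguity code are assumptions made for illustration. It joins the blocks of each row and confirms that the printed count matches the number of bases read so far.

```python
import re


def parse_listing_rows(rows):
    """Join rows like 'ACAG... TTAC 50' into one sequence, checking counts."""
    sequence = []
    total = 0
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue  # skip blank lines between rows
        # The last token is the running base count; the rest are base blocks.
        *blocks, count = tokens
        bases = "".join(blocks).upper()
        if not re.fullmatch(r"[ACGTN]+", bases):
            raise ValueError(f"unexpected character in row: {row!r}")
        total += len(bases)
        if total != int(count):
            raise ValueError(f"row ends at {total}, listing says {count}")
        sequence.append(bases)
    return "".join(sequence)


# First two rows of SEQ ID NO: 1 as printed above.
rows = [
    "ACAGCTGACA CCCTGCAAGG CCACCGGGAC CGCTTCATGG AGCAGTTTAC 50",
    "AAAGTTGAAA GATCTGTTCT ACCGCTCCAG CAACCTGCAG TACTTCAAGC 100",
]
seq = parse_listing_rows(rows)
print(len(seq))  # 100
```

A row whose block lengths do not add up to its printed count raises an error, which makes scanning artifacts such as a dropped or duplicated base easy to locate.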

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Thr Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe Met Glu Gln 
1 5 10 15 

Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser Asn Leu Gln 
20 25 30 

Tyr Phe Lys Arg Val Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro 
35 40 45 

Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro 
50 55 60 

Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp Ser Glu Pro 
65 70 75 

Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln 
80 85 90 

Asn Leu Phe Asp Asn Lys Phe Asp Asp Phe Gly Ser Ser Ser Ser 
95 100 105 

Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly Val Asn Lys Asp 
110 115 120 

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly 
125 130 135 

Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu Ser Gln Arg Val 
140 145 150 

Val Leu Gln Leu Lys Gly His Val Ser Glu Leu Glu Ala Asp Leu 
155 160 165 

Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala Asp Asp Cys Glu 
170 175 180 

Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Gln Arg Glu Asp Thr 
185 190 195 

Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile Glu Arg Lys Ala Gln 
200 205 210 

Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser Glu 
215 220 225 

Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu Val 
230 235 240 

Thr Lys Gln Val Ser Met Ala Arg Gln Ala Gln Val Asp Leu Glu 
245 250 255 

Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu Glu Arg Ile Ser Asp 
260 265 270 

Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu Glu Val Leu Glu Ser 
275 280 285 

Leu Lys Gln Glu Leu Gly Thr Ser Gln Arg Glu Leu Gln Val Leu 
290 295 300 

Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser Glu Ala Asn Trp Ala 
305 310 315 

Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg Asp Ser Leu Val Ser 
320 325 330 

Gly Ala Ala His Arg Glu Glu Glu Leu Ser Ala Leu Arg Lys Glu 
335 340 345 

Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser Thr Glu Glu Ser Met 
350 355 360 

Cys Gln Leu Ala Lys Asp Gln Arg Lys Met Leu Leu Val Gly Ser 
365 370 375 

Arg Lys Ala Ala Glu Gln Val Ile Gln Asp Ala 
380 385 386 
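
Protein sequences in this listing are written in three-letter residue codes, whereas alignments are usually shown in one-letter code. The following is a small illustrative sketch, not part of the original filing, of converting one row of a listing to one-letter code; the function name `to_one_letter` is hypothetical, and tokens that are not one of the twenty standard three-letter codes raise an error rather than being guessed.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert a whitespace-separated row of three-letter codes."""
    letters = []
    for token in residues.split():
        if token not in THREE_TO_ONE:
            # Refuse to guess at misread or non-standard residue names.
            raise ValueError(f"unrecognized residue code: {token!r}")
        letters.append(THREE_TO_ONE[token])
    return "".join(letters)


# First row of SEQ ID NO: 2, residues 1-15.
first_row = "Thr Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe Met Glu Gln"
print(to_one_letter(first_row))  # TADTLQGHRDRFMEQ
```

Position-number rows must be filtered out before conversion, since this sketch treats every token as a residue.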



(2) INFORMATION FOR SEQ ID N0:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4796 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CAGTGTACGG TTGATCATAT AACGCCGCGG GCGGGGATTG GTTTATATAT 50 
CGCAAATTGA TNTAGGGGGG GGGGGATGGN CAGAGATTTC GCTTCATTAG 100 
GCCATTATAA GCAGGAAGGG TTTCAAGGAA AAAAACCCAG AAAGTGCATA 150 
TTGCACCCAC CATGAGAAAG GGGCAACAGA CCTTNTGTTN TGTTNTCAAC 200 
CGCCTGCTTC TGTTTTAGCA ACGCAGTGTT TTGGTGGAAG TTGTGCCATG 250 
TGTTCCACAA ANTCTTCCGA GATGGACACC CGAACGTCCT GAAGGACTTT 300 
GTGAGATACA GAAATGAATT GAGTGACATG AGCAGGATGT GGGGCCACCT 350 
GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA CTGCTAAGAA 400 
CCAAGATGGA GTACCACACC AAAAATCCCA GGTTCCCAGG CAACCTGCAG 450 
ATGAGTGACC GCCAGCTGGA CGAGGCTGGA GAAAGTGACG TGAACAACTT 500 
TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT GAACTCAACC 550 
TCTTCCAAAC AGTATTCAAC TCCCTGGACA TGTCCCGCTC TGTGTCCGTG 600 
ACGGCAGCAG GGCAGTGCCG CCTCGCCCCG CTGATCCAGG TCATCTTGGA 650 
CTGCAGCCAC CTTTATGACT ACACTGTCAA GCTTCTCTTC AAACTCCACT 700 
CCTGCCTCCC AGCTGACACC CTGCAAGGCC ACCGGGACCG CTTCATGGAG 750 
CAGTTTACAA AGTTGAAAGA TCTGTTCTAC CGCTCCAGCA ACCTGCAGTA 800 

CTTCAAGCGG CTCATTCAGA TCCCCCAGCT GCCTGAGAAC CCACCCAACT 850 

TCCTGCGAGC CTCAGCCCTG TCAGAACATA TCAGCCCTGT GGTGGTGATC 900 

CCTGCAGAGG CCTCATCCCC CGACAGCGAG CCAGTCCTAG AGAAGGATGA 950 

CCTCATGGAC ATGGATGCCT CTCAGCAGAA TTTATTTGAC AACAAGTTTG 1000 

ATGACATCTT TGGCAGTTCA TTCAGCAGTG ATCCCTTCAA TTTCAACAGT 1050 

CAAAATGGTG TGAACAAGGA TGAGAAGGAC CACTTAATTG AGCGACTATA 1100 

CAGAGAGATC AGTGGATTGA AGGCACAGCT AGAAAACATG AAGACTGAGA 1150 

GCCAGCGGGT TGTGCTGCAG CTGAAGGGCC ACGTCAGCGA GCTGGAAGCA 1200 

GATCTGGCCG AGCAGCAGCA CCTGCGGCAG CAGGCGGCCG ACGACTGTGA 1250 

ATTCCTGCGG GCAGAACTGG ACGAGCTCAG GAGGCAGCGG GAGGACACCG 1300 

AGAAGGCTCA GCGGAGCCTG TCTGAGATAG AAAGGAAAGC TCAAGCCAAT 1350 

GAACAGCGAT ATAGCAAGCT AAAGGAGAAG TACAGCGAGC TGGTTCAGAA 1400 

CCACGCTGAC CTGCTGCGGA AGAATGCAGA GGTGACCAAA CAGGTGTCCA 1450 

TGGCCAGACA AGCCCAGGTA GATTTGGAAC GAGAGAAAAA AGAGCTGGAG 1500 

GATTCGTTGG AGCGCATCAG TGACCAGGGC CAGCGGAAGA CTCAAGAACA 1550 

GCTGGAAGTT CTAGAGAGCT TGAAGCAGGA ACTTGGCACA AGCCAACGGG 1600 

AGCTTCAGGT TCTGCAAGGC AGCCTGGAAA CTTCTGCCCA GTCAGAAGCA 1650 

AACTGGGCAG CCGAGTTCGC CGAGCTAGAG AAGGAGCGGG ACAGCCTGGT 1700 

GAGTGGCGCA GCTCATAGGG AGGAGGAATT ATCTGCTCTT CGGAAAGAAC 1750 

TGCAGGACAC TCAGCTCAAA CTGGCCAGCA CAGAGGAATC TATGTGCCAG 1800 

CTTGCCAAAG ACCAACGAAA AATGCTTCTG GTGGGGTCCA GGAAGGCTGC 1850 

GGAGCAGGTG ATACAAGACG CCCTGAACCA GCTTGAAGAA CCTCCTCTCA 1900 

TCAGCTGCGC TGGGTCTGCA GATCACCTCC TCTCCACGGT CACATCCATT 1950 

TCCAGCTGCA TCGAGCAACT GGAGAAAAGC TGGAGCCAGT ATCTGGCCTG 2000 

CCCAGAAGAC ATCAGTGGAC TTCTCCATTC CATAACCCTG CTGGCCCACT 2050 

TGACCAGCGA CGCCATTGCT CATGGTGCCA CCACCTGCCT CAGAGCCCCA 2100 

CCTGAGCCTG CCGACTCACT GACCGAGGCC TGTAAGCAGT ATGGCAGGGA 2150 

AACCCTCGCC TACCTGGCCT CCCTGGAGGA AGAGGGAAGC CTTGAGAATG 2200 

CCGACAGCAC AGCCATGAGG AACTGCCTGA GCAAGATCAA GGCCATCGGC 2250 

GAGGAGCTCC TGCCCAGGGG ACTGGACATC AAGCAGGAGG AGCTGGGGGA 2300 

CCTGGTGGAC AAGGAGATGG CGGCCACTTC AGCTGCTATT GAAACTTGCA 2350 

CGGCCAGAAT AGAGGAGATG CTCAGCAAAT CCCGAGCAGG AGACACAGGA 2400 

GTCAAATTGG AGGTGAATGA AAGGATCCTT CGTTGCTGTA CCAGCCTCAT 2450 

GCAAGCTATT CAGGTGCTCA TCGTGGCCTC TAAGGACCTC CAGAGAGAGA 2500 

TTGTGGAGAG CGGCAGGGGT ACAGCATCCC CTAAAGAGTT TTATGCCAAG 2550 

AACTCTCGAT GGACAGAAGG ACTTATCTCA GCCTCCAAGG CTGTGGGCTG 2600 

GGGAGCCACT GTCATGGTGG ATGCAGCTGA TCTGGTGGTA CAAGGCAGAG 2650 

GGAAATTTGA GGAGCTAATG GTGTGTTCTC ATGAAATTGC TGCTAGCACA 2700 

GCCCAGCTTG TGGCTGCATC CAAGGTGAAA GCTGATAAGG ACAGCCCCAA 2750 

CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG GCCACTGCCG 2800 

GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGAC 2850 

AACATGGACT TCTCAAGCAT GACGCTGACA CAGATCAAAC GCCAAGAGAT 2900 

GGATTCTCAG GTTAGGGTGC TAGAGCTAGA AAATGAATTG CAGAAGGAGC 2950 

GTCAAAAACT GGGAGAGCTT CGGAAAAAGC ACTACGAGCT TGCTGGTGTT 3000 

GCTGAGGGCT GGGAAGAAGG AACAGAGGCA TCTCCACCTA CACTGCAAGA 3050 

AGTGGTAACC GAAAAAGAAT AGAGCCAAAC CAACACCCCA TATGTCAGTG 3100 

TAAATCCTTG TTACCTATCT CGTGTGTGTT ATTTCCCCAG CCACAGGCCA 3150 

AATCCTTGGA GTCCCAGGGG CAGCCACACC ACTGCCATTA CCCAGTGCCG 3200 

AGGACATGCA TGACACTTCC CAAAGATCCC TCCATAGCGA CACCCTTTCT 3250 

GTTTGGACCC ATGGTCATCT CTGTTCTTTT CCCGCCTCCC TAGTTAGCAT 3300 

CCAGGCTGGC CAGTGCTGCC CATGAGCAAG CCTAGGTACG AAGAGGGGTG 3350 

GTGGGGGGCA GGGCCACTCA ACAGAGAGGA CCAACATCCA GTCCTGCTGA 3400 

CTATTTGACC CCCACAACAA TGGGTATCCT TAATAGAGGA GCTGCTTGTT 3450 

GTTTGTTGAC AGCTTGGAAA GGGAAGATCT TATGCCTTTT CTTTTCTGTT 3500 

TTCTTCTCAG TCTTTTCAGT TTCATCATTT GCACAAACTT GTGAGCATCA 3550 

GAGGGCTGAT GGATTCCAAA CCAGGACACT ACCCTGAGAT CTGCACAGTC 3600 

AGAAGGACGG CAGGAGTGTC CTGGCTGTGA ATGCCAAAGC CATTCTCCCC 3650 

CTCTTTGGGC AGTGCCATGG ATTTCCACTG CTTCTTATGG TGGTTGGTTG 3700 

GGTTTTTTGG TTTTGTTTTT TTTTTTTAAG TTTCACTCAC ATAGCCAACT 3750 

CTCCCAAAGG GCACACCCCT GGGGCTGAGT CTCCAGGGCC CCCCAACTGT 3800 

GGTAGCTCCA GCGATGGTGC TGCCCAGGCC TCTCGGTGCT CCATCTCCGC 3850 

CTCCACACTG ACCAAGTGCT GGCCCACCCA GTCCATGCTC CAGGGTCAGG 3900 

CGGAGCTGCT GAGTGACAGC TTTCCTCAAA AAGCAGAAGG AGAGTGAGTG 3950 

CCTTTCCCTC CTAAAGCTGA ATCCCGGCGG AAAGCCTCTG TCCGCCTTTA 4000 

CAAGGGAGAA GACAACAGAA AGAGGGACAA GAGGGTTCAC ACAGCCCAGT 4050 

TCCCGTGACG AGGCTCAAAA ACTTGATCAC ATGCTTGAAT GGAGCTGGTG 4100 

AGATCAACAA CACTACTTCC CTGCCGGAAT GAACTGTCCG TGAATGGTCT 4150 

CTGTCAAGCG GGCCGTCTCC CTTGGCCCAG AGACGGAGTG TGGGAGTGAT 4200 

TCCCAACTCC TTTCTGCAGA CGTCTGCCTT GGCATCCTCT TGAATAGGAA 4250 

GATCGTTCCA CTTTCTACGC AATTGACAAA CCCGGAAGAT CAGATGCAAT 4300 

TGCTCCCATC AGGGAAGAAC CCTATACTTG GTTTGCTACC CTTAGTATTT 4350 

ATTACTAACC TCCCTTAAGC AGCAACAGCC TACAAAGAGA TGCTTGGAGC 4400 

AATCAGAACT TCAGGTGTGA CTCTAGCAAA GCTCATCTTT CTGCCCGGCT 4450 

ACATCAGCCT TCAAGAATCA GAAGAAAGCC AAGGTGCTGG ACTGTTACTG 4500 

ACTTGGATCC CAAAGCAAGG AGATCATTTG GAGCTCTTGG GTCAGAGAAA 4550 

ATGAGAAAGG ACAGAGCCAG CGGCTCCAAC TCCTTTCAGC CACATGCCCC 4600 

AGGCTCTCGC TGCCCTGTGG ACAGGATGAG GACAGAGGGC ACATGAACAG 4650 

CTTGCCAGGG ATGGGCAGCC CAACAGCACT TTTCCTCTTC TAGATGGACC 4700 

CCAGCATTTA AGTGACCTTC TGATCTTGGG AAAACAGCGT CTTCCTTCTT 4750 

TATCTATAGC AACTCATTGG TGGTAGCCAT CAAGCACTTC GGAATT 4796 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Arg Met Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu 
1 5 10 15 

Cys Ser Ile Tyr Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His 
20 25 30 

Thr Lys Asn Pro Arg Phe Pro Gly Asn Leu Gln Met Ser Asp Arg 
35 40 45 

Gln Leu Asp Glu Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gln 
50 55 60 

Leu Thr Val Glu Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu 
65 70 75 

Phe Gln Thr Val Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser 
80 85 90 

Val Thr Ala Ala Gly Gln Cys Arg Leu Ala Pro Leu Ile Gln Val 
95 100 105 

Ile Leu Asp Cys Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu 
110 115 120 

Phe Lys Leu His Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His 
125 130 135 

Arg Asp Arg Phe Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe 
140 145 150 

Tyr Arg Ser Ser Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile 
155 160 165 

Pro Gln Leu Pro Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala 
170 175 180 

Leu Ser Glu His Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala 
185 190 195 

Ser Ser Pro Asp Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met 
200 205 210 

Asp Met Asp Ala Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp 
215 220 225 

Asp Ile Phe Gly Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn 
230 235 240 

Ser Gln Asn Gly Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu 
245 250 255 

Arg Leu Tyr Arg Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn 
260 265 270 

Met Lys Thr Glu Ser Gln Arg Val Val Leu Gln Leu Lys Gly His 
275 280 285 

Val Ser Glu Leu Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg 
290 295 300 

Gln Gln Ala Ala Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp 
305 310 315 

Glu Leu Arg Arg Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser 
320 325 330 

Leu Ser Glu Ile Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr 
335 340 345 

Ser Lys Leu Lys Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala 
350 355 360 

Asp Leu Leu Arg Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met 
365 370 375 

Ala Arg Gln Ala Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu 
380 385 390 

Glu Asp Ser Leu Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr 
395 400 405 

Gln Glu Gln Leu Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Gly 
410 415 420 

Thr Ser Gln Arg Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr 
425 430 435 

Ser Ala Gln Ser Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu 
440 445 450 

Glu Lys Glu Arg Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu 
455 460 465 

Glu Glu Leu Ser Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu 
470 475 480 

Lys Leu Ala Ser Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp 
485 490 495 

Gln Arg Lys Met Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln 
500 505 510 

Val Ile Gln Asp Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile 
515 520 525 

Ser Cys Ala Gly Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser 
530 535 540 

Ile Ser Ser Cys Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr 
545 550 555 

Leu Ala Cys Pro Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr 
560 565 570 

Leu Leu Ala His Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr 
575 580 585 

Thr Cys Leu Arg Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu 
590 595 600 

Ala Cys Lys Gln Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser 
605 610 615 

Leu Glu Glu Glu Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met 
620 625 630 

Arg Asn Cys Leu Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu 
635 640 645 

Pro Arg Gly Leu Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val 
650 655 660 

Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Thr Cys Thr 
665 670 675 

Ala Arg Ile Glu Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr 
680 685 690 

Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Arg Cys Cys Thr 
695 700 705 

Ser Leu Met Gln Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp 
710 715 720 

Leu Gln Arg Glu Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro 
725 730 735 

Lys Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile 
740 745 750 

Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Val Met Val Asp 
765 770 775 

Ala Ala Asp Leu Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu 
780 785 790 

Met Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val 
795 800 805 



WO 99/60986 PCT/US99/11743 

Ala Ala Ser Lys Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala 
810 815 820 

Gln Leu Gln Gln Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly 
825 830 835 

Val Val Ala Ser Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr 
840 845 850 

Asp Asn Met Asp Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg 
855 860 865 

Gln Glu Met Asp Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu 
870 875 880 

Leu Gln Lys Glu Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His 
885 890 895 

Tyr Glu Leu Ala Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu 
900 905 910 

Ala Ser Pro Pro Thr Leu Gln Glu Val Val Thr Glu Lys Glu 
915 920 924 
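
The relationship between the cDNA of SEQ ID NO: 3 and the protein of SEQ ID NO: 4 can be spot-checked by translating the start of the open reading frame. The sketch below is illustrative only and not part of the original filing: it uses the standard genetic code, the helper name `translate` is hypothetical, and the 45-base fragment was copied by hand from the rows of SEQ ID NO: 3 ending at positions 350 and 400, beginning at the ATG that corresponds to the first methionine of SEQ ID NO: 4.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 0; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Fragment of SEQ ID NO: 3 starting at the ATG of the open reading frame.
orf_start = (
    "ATG" "AGCAGGATGT" "GGGGCCACCT"
    "GAGCGAGGGG" "TATGGCCAGC" "TG"
)
print(translate(orf_start))  # MSRMWGHLSEGYGQL
```

The one-letter result corresponds to Met Ser Arg Met Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu, the first fifteen residues listed for SEQ ID NO: 4.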



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Leu Leu Cys Gin Gly Ser Glu Trp Arg Arg Asp Gin Gin Leu 

5 10 15 

Gly Thr Ala Asn Ala Arg Gln Trp Cys Pro Leu Pro Gln Asp Ala 
20 25 30 

Gln Pro Ala Gly Ser Trp Glu Arg Cys Pro Pro Leu Pro Pro Ala 
35 40 45 

Gly Arg Leu Gln Gly Thr Asp His Pro Trp Gly Trp Gly Arg Leu 
50 55 60 

Ala Gly Gly Gly Glu Arg Gly Gly Leu Trp Glu Gly Leu Ser His 
65 70 75 

Ser Gln Arg Leu Ile His Leu Ile Leu Leu Ser Leu Pro Leu Leu 
80 85 90 

Val Phe Gln Thr Val Ser Ile Asn Lys Ala Ile Asn Thr Gln Glu 
95 100 105 

Val Ala Val Lys Glu Lys His Ala Arg Thr Cys Ile Leu Gly Thr 
110 115 120 

His His Glu Lys Gly Ala Gln Thr Phe Trp Ser Val Val Asn Arg 
125 130 135 

Leu Pro Leu Ser Ser Asn Ala Val Leu Cys Trp Lys Phe Cys His 
140 145 150 

Val Phe His Lys Leu Leu Arg Asp Gly His Pro Asn Val Leu Lys 
155 160 165 

Asp Ser Leu Arg Tyr Arg Asn Glu Leu Ser Asp Met Ser Arg Met 
170 175 180 

Trp Gly His Leu Ser Glu Gly Tyr Gly Gln Leu Cys Ser Ile Tyr 
185 190 195 

Leu Lys Leu Leu Arg Thr Lys Met Glu Tyr His Thr Lys Asn Pro 
200 205 210 

Arg Phe Pro Gly Asn Leu Gln Met Ser Asp Arg Gln Leu Asp Glu 
215 220 225 

Ala Gly Glu Ser Asp Val Asn Asn Phe Phe Gln Leu Thr Val Glu 
230 235 240 

Met Phe Asp Tyr Leu Glu Cys Glu Leu Asn Leu Phe Gln Thr Val 
245 250 255 

Phe Asn Ser Leu Asp Met Ser Arg Ser Val Ser Val Thr Ala Ala 
260 265 270 

Gly Gln Cys Arg Leu Ala Pro Leu Ile Gln Val Ile Leu Asp Cys 
275 280 285 

Ser His Leu Tyr Asp Tyr Thr Val Lys Leu Leu Phe Lys Leu His 
290 295 300 

Ser Cys Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe 
305 310 315 

Met Glu Gln Phe Thr Lys Leu Lys Asp Leu Phe Tyr Arg Ser Ser 
320 325 330 

Asn Leu Gln Tyr Phe Lys Arg Leu Ile Gln Ile Pro Gln Leu Pro 
335 340 345 

Glu Asn Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His 
350 355 360 

Ile Ser Pro Val Val Val Ile Pro Ala Glu Ala Ser Ser Pro Asp 
365 370 375 

Ser Glu Pro Val Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala 
380 385 390 

Ser Gln Gln Asn Leu Phe Asp Asn Lys Phe Asp Asp Ile Phe Gly 
395 400 405 

Ser Ser Phe Ser Ser Asp Pro Phe Asn Phe Asn Ser Gln Asn Gly 
410 415 420 

Val Asn Lys Asp Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg 
425 430 435 

Glu Ile Ser Gly Leu Lys Ala Gln Leu Glu Asn Met Lys Thr Glu 
440 445 450 

Ser Gln Arg Val Val Leu Gln Leu Lys Gly His Val Ser Glu Leu 
455 460 465 

Glu Ala Asp Leu Ala Glu Gln Gln His Leu Arg Gln Gln Ala Ala 
470 475 480 

Asp Asp Cys Glu Phe Leu Arg Ala Glu Leu Asp Glu Leu Arg Arg 
485 490 495 

Gln Arg Glu Asp Thr Glu Lys Ala Gln Arg Ser Leu Ser Glu Ile 
500 505 510 

Glu Arg Lys Ala Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys 
515 520 525 

Glu Lys Tyr Ser Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg 
530 535 540 

Lys Asn Ala Glu Val Thr Lys Gln Val Ser Met Ala Arg Gln Ala 
545 550 555 

Gln Val Asp Leu Glu Arg Glu Lys Lys Glu Leu Glu Asp Ser Leu 
560 565 570 

Glu Arg Ile Ser Asp Gln Gly Gln Arg Lys Thr Gln Glu Gln Leu 
575 580 585 

Glu Val Leu Glu Ser Leu Lys Gln Glu Leu Ala Thr Ser Gln Arg 
590 595 600 

Glu Leu Gln Val Leu Gln Gly Ser Leu Glu Thr Ser Ala Gln Ser 
605 610 615 

Glu Ala Asn Trp Ala Ala Glu Phe Ala Glu Leu Glu Lys Glu Arg 
620 625 630 

Asp Ser Leu Val Ser Gly Ala Ala His Arg Glu Glu Glu Leu Ser 
635 640 645 

Ala Leu Arg Lys Glu Leu Gln Asp Thr Gln Leu Lys Leu Ala Ser 
650 655 660 

Thr Glu Glu Ser Met Cys Gln Leu Ala Lys Asp Gln Arg Lys Met 
665 670 675 

Leu Leu Val Gly Ser Arg Lys Ala Ala Glu Gln Val Ile Gln Asp 
680 685 690 

Ala Leu Asn Gln Leu Glu Glu Pro Pro Leu Ile Ser Cys Ala Gly 
695 700 705 

Ser Ala Asp His Leu Leu Ser Thr Val Thr Ser Ile Ser Ser Cys 
710 715 720 

Ile Glu Gln Leu Glu Lys Ser Trp Ser Gln Tyr Leu Ala Cys Pro 
725 730 735 

Glu Asp Ile Ser Gly Leu Leu His Ser Ile Thr Leu Leu Ala His 
740 745 750 

Leu Thr Ser Asp Ala Ile Ala His Gly Ala Thr Thr Cys Leu Arg 
755 760 765 

Ala Pro Pro Glu Pro Ala Asp Ser Leu Thr Glu Ala Cys Lys Gln 
770 775 780 

Tyr Gly Arg Glu Thr Leu Ala Tyr Leu Ala Ser Leu Glu Glu Glu 
785 790 795 

Gly Ser Leu Glu Asn Ala Asp Ser Thr Ala Met Arg Asn Cys Leu 
800 805 810 

Ser Lys Ile Lys Ala Ile Gly Glu Glu Leu Leu Pro Arg Gly Leu 
815 820 825 

Asp Ile Lys Gln Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met 
830 835 840 

Ala Ala Thr Ser Ala Ala Ile Glu Thr Ala Thr Ala Arg Ile Glu 
845 850 855 

Glu Met Leu Ser Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu 
860 865 870 

Glu Val Asn Glu Arg Ile Leu Gly Cys Cys Thr Ser Leu Met Gln 
875 880 885 

Ala Ile Gln Val Leu Ile Val Ala Ser Lys Asp Leu Gln Arg Glu 
890 895 900 

Ile Val Glu Ser Gly Arg Gly Thr Ala Ser Pro Lys Glu Phe Tyr 
905 910 915 

Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys 
920 925 930 

Ala Val Gly Trp Gly Ala Thr Val Met Val Asp Ala Ala Asp Leu 
935 940 945 

Val Val Gln Gly Arg Gly Lys Phe Glu Glu Leu Met Val Cys Ser 
950 955 960 

His Glu Ile Ala Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys 
965 970 975 

Val Lys Ala Asp Lys Asp Ser Pro Asn Leu Ala Gln Leu Gln Gln 
980 985 990 

Ala Ser Arg Gly Val Asn Gln Ala Thr Ala Gly Val Val Ala Ser 
995 1000 1005 

Thr Ile Ser Gly Lys Ser Gln Ile Glu Glu Thr Asp Asn Met Asp 
1010 1015 1020 

Phe Ser Ser Met Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp 
1025 1030 1035 

Ser Gln Val Arg Val Leu Glu Leu Glu Asn Glu Leu Gln Lys Glu 
1040 1045 1050 

Arg Gln Lys Leu Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Ala 
1055 1060 1065 

Gly Val Ala Glu Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Pro 
1070 1075 1080 

Thr Leu Gln Glu Val Val Thr Glu Lys Glu 
1085 1090 



(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3301 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: cDNA for Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



CGGTGAGCTG GAGGAGCAGC GGAAGCAGAA GCAGAAGGCC CTGGTGGATA 50 
ATGAGCAGCT CCGCCACGAG CTGGCCCAGC TGAGGGCTGC CCAGCTGGAG 100 
CGCGAGCGGA GCCAGGGCCT GCGTGAGGAG GCTGAGAGGA AGGCCAGTGC 150 
CACGGAGGCG CGCTACAACA AGCTGAAGGA AAAGCACAGT GAGCTCGTCC 200 
ATGTGCACGC GGAGCTGCTC AGAAAGAACG CGGACACAGC CAAGCAGCTG 250 
ACGGTGACGC AGCAAAGCCA GGAGGAGGTG GCGCGGGTGA AGGAGCAGCT 300 
GGCCTTCCAG GTGGAGCAGG TGAAGCGGGA GTCGGAGTTG AAGCTAGAGG 350 
AGAAGAGCGA CCAGCAGGAG AAGCTCAAGA GGGAGCTGGA GGCCAAGGCC 400 
GGAGAGCTGG CCCGCGCGCA GGAGGCCCTG AGCCACACAG AGCAGAGCAA 450 
GTCGGAGCTG AGCTCACGGC TGGACACACT GAGTGCGGAG AAGGATGCTC 500 
TGAGTGGAGC TGTGCGGCAG CGGGAGGCAG ACCTGCTGGC GGCGCAGAGC 550 
CTGGTGCGCG AGACAGAGGC GGCGCTGAGC CGGGAGCAGC AGCGCAGCTC 600 
CCAGGAGCAG GGCGAGTTGC AGGGCCGGCT GGCAGAGAGG GAGTCTCAGG 650 
AGCAGGGGCT GCGGCAGAGG CTGCTGGACG AGCAGTTCGC AGTGTTGCGG 700 
GGCGCTGCTG CCGAGGCCGC GGGCATCCTG CAGGATGCCG TGAGCAAGCT 750 
GGACGACCCC CTGCACCTGC GCTGTACCAG CTCCCCAGAC TACCTGGTGA 800 
GCAGGGCCCA GGAGGCCTTG GATGCCGTGA GCACCCTGGA GGAGGGCCAC 850 
GCCCAGTACC TGACCTCCTT GGCAGACGCC TCCGCCCTGG TGGCAGCTCT 900 
GACCCGCTTC TCCCACCTGG CTGCGGATAC CATCATCAAT GGCGGTGCCA 950 
CCTCGCACCT GGCTCCCACC GACCCTGCCG ACCGCCTCAT AGACACCTGC 1000 
AGGGAGTGCG GGGCCCGGGC TCTGGAGCTC ATGGGGCAGC TGCAGGACCA 1050 
GCAGGCTCTG CGGCACATGC AGGCCAGCCT GGTGCGGACA CCCCTGCAGG 1100 
GCATCCTTCA GCTGGGCCAA GAACTGAAAC CCAAGAGCCT AGATGTGCGG 1150 
CAGGAGGAGC TGGGGGCCGT GGTCGACAAG GAGATGGCGG CCACATCCGC 1200 
AGCCATTGAA GATGCTGTGC GGAGGATTGA GGACATGATG AACCAGGCAC 1250 
GCCACGCCAG CTCGGGGGTG AAGCTGGAGG TGAACGAGAG GATCCTCAAC 1300 
TCCTGCACAG ACCTGATGAA GGCTATCCGG CTCCTGGTGA CGACATCCAC 1350 
TAGCCTGCAG AAGGAGATCG TGGAGAGCGG CAGGGGGGCA GCCACGCAGC 1400 
AGGAATTTTA CGCCAAGAAC TCGCGCTGGA CCGAAGGCCT CATCTCGGCC 1450 
TCCAAGGCTG TGGGCTGGGG AGCCACACAG CTGGTGGAGG CAGCTGACAA 1500 
GGTGGTGCTT CACACGGGCA AGTATGAGGA GCTCATCGTC TGCTCCCACG 1550 
AGATCGCAGC CAGCACGGCC CAGCTGGTGG CGGCCTCCAA GGTGAAGGCC 1600 
AACAAGCACA GCCCCCACCT GAGCCGCCTG CAGGAATGTT CTCGCACAGT 1650 
CAATGAGAGG GCTGCCAATG TGGTGGCCTC CACCAAGTCA GGCCAGGAGC 1700 
AGATTGAGGA CAGAGACACC ATGGATTTCT CCGGCCTGTC CCTCATCAAG 1750 
CTGAAGAAGC AGGAGATGGA GACGCAGGTG CGTGTCCTGG AGCTGGAGAA 1800 
GACGCTGGAG GCTGAACGCA TGCGGCTGGG GGAGTTGCGG AAGCAACACT 1850 
ACGTGCTGGC TGGGGCATCA GGCAGCCCTG GAGAGGAGGT GGCCATCCGG 1900 
CCCAGCACTG CCCCCCGAAG TGTAACCACC AAGAAACCAC CCCTGGCCCA 1950 
GAAGCCCAGC GTGGCCCCCA GACAGGACCA CCAGCTTGAC AAAAAGGATG 2000 
GCATCTACCC AGCTCAACTC GTGAACTACT AGGCCCCCCA GGGGTCCAGC 2050 
AGGGTGGCTG GTGACAGGCC TGGGCCTCTG CAACTGCCCT GACAGGACCG 2100 
AGAGGCCTTG CCCCTCCACC TGGTGCCCAA GCCTCCCGCC CCACCGTCTG 2150 
GATCAATGTC CTCAAGGCCC CTGGCCCTTA CTGAGCCTGC AGGGTCCTGG 2200 
GCCATGTGGG TGGTGCTTCT GGATGTGAGT CTCTTATTTA TCTGCAGAAG 2250 
GAACTTTGGG GTGCAGCCAG GACCCGGTAG GCCTGAGCCT CAACTCTTCA 2300 
GAAAATAGTG TTTTTAATAT TCCTCTTCAG AAAATAGTGT TTTTAATATT 2350 
CCGAGCTAGA GCTCTTCTTC CTACGTTTGT AGTCAGCACA CTGGGAAACC 2400 
GGGCCAGCGT GGGGCTCCCT GCCTTCTGGA CTCCTGAAGG TCGTGGATGG 2450 
ATGGAAGGCA CACAGCCCGT GCCGGCTGAT GGGACGAGGG TCAGGCATCC 2500 
TGTCTGTGGC CTTCTGGGGC ACCGATTCTA CCAGGCCCTC CAGCTGCGTG 2550 
GTCTCCGCAG ACCAGGCTCT GTGTGGGCTA GAGGAATGTC GCCCATTACC 2600 
TCCTCAGGCC CTGGCCCTCG GGCCTCCGTG ATGGGAGCCC CCCAGGAGGG 2700 
GTCAGATGCT GGAAGGGGCC GCTTTCTGGG GAGTGAGGTG AGACATAGCG 2750 
GCCCAGGCGC TGCCTTCACT CCTGGAGTTT CCATTTCCAG CTGGAATCTG 2800 
CAGCCACCCC CATTTCCTGT TTTCCATTCC CCCGTTCTGG CCGCGCCCCA 2850 
CTGCCCACCT GAAGGGGTGG TTTCCAGCCC TCCGGAGAGT GGGCTTGGCC 2900 
CTAGGCCCTC CAGCTCAGCC AGAAAAAGCC CAGAAACCCA GGTGCTGGAC 2950 
CAGGGCCCTC AGGGAGGGAC CCTGCGGCTA GAGTGGGCTA GGCCCTGGCT 3000 
TTGCCCGTCA GATTTGAACG AATGTGTGTC CCTTGAGCCC AAGGAGAGCG 3050 
GCAGGAGGGG TGGGACCAGG CTGGGAGGAC AGAGCCAGCA GCTGCCATGC 3100 
CCTCCTGCTC CCCCCACCCC AGCCCTAGCC CTTTAGCCTT TCACCCTGTG 3150 
CTCTGGAAAG GCTACCAAAT ACTGGCCAAG GTCAGGAGGA GCAAAAATGA 3200 
GCCAGCACCA GCGCCTTGGC TTTGTGTTAG CATTTCCTCC TGAAGTGTTC 3250 
TGTTGGCAAT AAAATGCACT TTGACTGTTA AAAAAAAAAA AAAAAAAAAA 3300 
A 3301 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(ix) FEATURE: Huntingtin-interacting protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Gly Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala Leu Val 
5 10 15 

Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Arg Ala Ala 
20 25 30 

Gln Leu Glu Arg Glu Arg Ser Gln Gly Leu Arg Glu Glu Ala Glu 
35 40 45 

Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Asn Lys Leu Lys Glu 
50 55 60 

Lys His Ser Glu Leu Val His Val His Ala Glu Leu Leu Arg Lys 
65 70 75 

Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln Ser Gln 
80 85 90 

Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln Val Glu 
95 100 105 

Gln Val Lys Arg Glu Ser Glu Leu Lys Leu Glu Glu Lys Ser Asp 
110 115 120 

Gln Gln Glu Lys Leu Lys Arg Glu Leu Glu Ala Lys Ala Gly Glu 
125 130 135 

Leu Ala Arg Ala Gln Glu Ala Leu Ser His Thr Glu Gln Ser Lys 
140 145 150 

Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Ser Ala Glu Lys Asp 
155 160 165 

Ala Leu Ser Gly Ala Val Arg Gln Arg Glu Ala Asp Leu Leu Ala 
170 175 180 

Ala Gln Ser Leu Val Arg Glu Thr Glu Ala Ala Leu Ser Arg Glu 
185 190 195 

Gln Gln Arg Ser Ser Gln Glu Gln Gly Glu Leu Gln Gly Arg Leu 
200 205 210 

Ala Glu Arg Glu Ser Gln Glu Gln Gly Leu Arg Gln Arg Leu Leu 
215 220 225 

Asp Glu Gln Phe Ala Val Leu Arg Gly Ala Ala Ala Glu Ala Ala 
230 235 240 

Gly Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro Leu His 
245 250 255 

Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg Ala Gln 
260 265 270 

Glu Ala Leu Asp Ala Val Ser Thr Leu Glu Glu Gly His Ala Gln 
275 280 285 

Tyr Leu Thr Ser Leu Ala Asp Ala Ser Ala Leu Val Ala Ala Leu 
290 295 300 

Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Ile Asn Gly Gly 
305 310 315 

Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg Leu Ile 
320 325 330 

Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu Met Gly 
335 340 345 

Gln Leu Gln Asp Gln Gln Ala Leu Arg His Met Gln Ala Ser Leu 
350 355 360 

Val Arg Thr Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln Glu Leu 
365 370 375 

Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly Ala Val 
380 385 390 

Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu Asp Ala 
395 400 405 

Val Arg Arg Ile Glu Asp Met Met Asn Gln Ala Arg His Ala Ser 
410 415 420 

Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn Ser Cys 
425 430 435 

Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Thr Thr Ser Thr 
440 445 450 

Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala Ala Thr 
455 460 465 

Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu Gly Leu 
470 475 480 

Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln Leu Val 
485 490 495 

Glu Ala Ala Asp Lys Val Val Leu His Thr Gly Lys Tyr Glu Glu 
500 505 510 

Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala Gln Leu 
515 520 525 

Val Ala Ala Ser Lys Val Lys Ala Asn Lys His Ser Pro His Leu 
530 535 540 

Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg Ala Ala 
545 550 555 

Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile Glu Asp 
560 565 570 

Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys Leu Lys 
575 580 585 

Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu Glu Lys 
590 595 600 

Thr Leu Glu Ala Glu Arg Met Arg Leu Gly Glu Leu Arg Lys Gln 
605 610 615 

His Tyr Val Leu Ala Gly Ala Ser Gly Ser Pro Gly Glu Glu Val 
620 625 630 

Ala Ile Arg Pro Ser Thr Ala Pro Arg Ser Val Thr Thr Lys Lys 
635 640 645 

Pro Pro Leu Ala Gln Lys Pro Ser Val Ala Pro Arg Gln Asp His 
650 655 660 

Gln Leu Asp Lys Lys Asp Gly Ile Tyr Pro Ala Gln Leu Val Asn 
665 670 675 

Tyr 



(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2338 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGCACGAGGG CTCATTCAGA TCCCCCAGCT GCCCGAGAAT CCACCCAACTT 50 
CCTACGAGCC TCGGCCCTGT CAGAGCACAT CAGTCCTGTG GTGGTGATCCC 100 
GGCAGAGGTG TCATCCCCAG ACAGTGAGCC TGTCCTGGAG AAGGATGACCT 150 
CATGGACATG GACGCCTCCC AGCAGACTTT GTTTGACAAC AAGTTTGATGA 200 
CGTCTTTGGC AGCTCATTGA GCAGCGACCC TTTCAATTTC AACAATCAAAA 250 
TGGCGTGAAC AAGGACGAGA AGGACCACTT GATTGAACGC CTGTACAGAGA 300 
GATCAGTGGA CTGACAGGGC AGCTGGACAA CATGAAGATT GAGAGCCAGCG 350 
GGCCATGCTG CAGCTGAAGG GTCGAGTGAG TGAGCTGGAG GCAGAGCTAGC 400 
AGAGCAGCAG CACTTGGGCC GGCAGGCTAT GGATGACTGC GAGTTCCTGCG 450 
CACTGAGCTG GATGAACTGA AGAGGCAGCG AGAGGACACG GAGAAGGCACA 500 
GCGCAGCCTG ACTGAGATAG AAAGAAAGGC CCAGGCTAAT GAACAGAGGTA 550 
TAGCAAGTTA AAAGAGAAGT ACAGTGAACT GGTGCAGAAC CATGCTGACCT 600 
GCTGCGGAAG AACGCAGAGG TGACCAAACA GGTGTCCGTG GCCCGGCAAGC 650 
CCAGGTGGAT TTGGAAAGAG AGAAAAAAGA GCTAGCAGAT TCCTTTGCAC 700 
GTGTAAGTGA CCAGGCCCAG CGGAAGACTC AAGAGCAACA GGATGTTCTA 750 
GAGAACCTGA AGCATGAACT GGCCACCAGC AGACAGGAGC TGCAGGTCCT 800 
CCACAGCAAC CTGGAAACCT CTGCCCAGTC AGAAGCGAAA TGGCTGACAC 850 
AGATCGCCGA GTTGGAGAAG GAACAAGGCA GCTTGGCGAC TGTTGCAGCT 900 
CAGAGAGAGG AAGAGTTATC AGCCCTCCGA GACCAGCTGG AAAGCACCCA 950 
GATCAAGCTG GCTGGGGCCC AGGAATCCAT GTGCCAGCAG GTGAAGGACC 1000 
AGAGGAAAAC CCTCTTGGCA GGGATCAGGA AGGCTGCGGA GCGTGAGATA 1050 
CAGGAGGCGC TGAGCCAGCT TGAGGAACCC ACCCTCATCA GCTGTGCAGG 1100 
ATCCACAGAT CACCTTCTCT CCAAAGTCAG CTCCGTTTCC AGCTGCCTCG 1150 
AGCAACTGGA AAAGAACGGC AGCCAGTATC TGGCCTGCCC AGAAGATATT 1200 
AGTGAGCTTC TGCACTCGAT CACCCTGCTT GCCCACTTGA CCGGTGACAC 1250 
TGTCATCCAG GGGAGTGCCA CCAGCCTCCG GGCCCCACCG GAGCCAGCCG 1300 
ACTCGTTGAC GGAGGCCTGT AGGCAGTATG GCAGAGAAAC CCTGGCCTAT 1350 
CTGTCCTCCC TGGAGGAAGA GGGAACTGTG GAGAATGCTG ACGTCACAGC 1400 
CCTTAGGAAT TGCCTCAGCA GGGTCAAGAC CCTTGGCGAG GAGCTGCTGC 1450 
CCAGGGGCCT GGACATCAAG CAGGAAGAGC TGGGTGACCT GGTGGACAAG 1500 
GAGATGGCAG CCACTTCAGC TGCCATTGAA GCTGCCACCA CCCGGATAGA 1550 
GGAAATTCTC AGTAAGTCCC GAGCAGGAGA CACGGGAGTC AAGCTGGAGG 1600 
TGAATGAGAG GATCCTGGGT TCCTGTACCA GCCTGATGCA GGCCATCAAG 1650 
GTGCTCGTTG TGGCCTCCAA GGACCTCCAG AAGGAGATAG TGGAGAGTGG 1700 
CAGGGGTAGT GCATCCCCTA AAGAATTTTA CGCCAAGAAC TCTCGGTGGA 1750 
CGGAAGGGCT GATATCCGCC TCCAAAGCTG TTGGTTGGGG AGCTACCATC 1800 
ATGGTGGATG CTGCTGATCT TGTGGTCCAA GGCAAAGGGA AGTTCGAGGA 1850 
GCTGATGGTG TGTTCACGCG AGATTGCTGC CAGTACTGCC CAGCTCGTGG 1900 
CTGCATCCAA GGTGAAAGCG AACAAGGGCA GCCTCAATCT GACCCAGCTG 2000 
CAGCAGGCCT CTCGAGGAGT GAACCAGGCC ACAGCCGCTG TGGTGGCCTC 2050 
AACCATTTCT GGCAAATCTC AGATTGAGGA AACAGACAGT ATGGACTTCT 2100 
CAAGCATGAC ACTGACCCAG ATCAAGCGCC AGGAGATGGA TTCCCAGGTT 2150 
AGGGTGCTGG AGCTGGAAAA TGACCTGCAG AAGGAGCGTC AGAAACTAGG 2200 
AGAGCTACGG AAGAAACACT ACGAGCTGGA GGGCGTGGCT GAGGGCTGGG 2250 
AGGAAGGGAC AGAAGCATCA CCGTCTACTG TCCAAGAAGC AATACCGGAC 2300 
AAAGAGTAGA GCCAAGCCGA CACCCCACAC ATCAGAAA 2338 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: mouse 

(ix) FEATURE: Huntingtin-interacting protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Ala Arg Gly Leu Ile Gln Ile Pro Gln Leu Pro Glu Asn Pro Pro 
5 10 15 

Asn Phe Leu Arg Ala Ser Ala Leu Ser Glu His Ile Ser Pro Val 
20 25 30 

Val Val Ile Pro Ala Glu Val Ser Ser Pro Asp Ser Glu Pro Val 
35 40 45 

Leu Glu Lys Asp Asp Leu Met Asp Met Asp Ala Ser Gln Gln Thr 
50 55 60 

Leu Phe Asp Asn Lys Phe Asp Asp Val Phe Gly Ser Ser Leu Ser 
65 70 75 

Ser Asp Pro Phe Asn Phe Asn Asn Gln Asn Gly Val Asn Lys Asp 
80 85 90 

Glu Lys Asp His Leu Ile Glu Arg Leu Tyr Arg Glu Ile Ser Gly 
95 100 105 

Leu Thr Gly Gln Leu Asp Asn Met Lys Ile Glu Ser Gln Arg Ala 
110 115 120 

Met Leu Gln Leu Lys Gly Arg Val Ser Glu Leu Glu Ala Glu Leu 
125 130 135 

Ala Glu Gln Gln His Leu Gly Arg Gln Ala Met Asp Asp Cys Glu 
140 145 150 

Phe Leu Arg Thr Glu Leu Asp Glu Leu Lys Arg Gln Arg Glu Asp 
155 160 165 

Thr Glu Lys Ala Gln Arg Ser Leu Thr Glu Ile Glu Arg Lys Ala 
170 175 180 

Gln Ala Asn Glu Gln Arg Tyr Ser Lys Leu Lys Glu Lys Tyr Ser 
185 190 195 

Glu Leu Val Gln Asn His Ala Asp Leu Leu Arg Lys Asn Ala Glu 
200 205 210 

Val Thr Lys Gln Val Ser Val Ala Arg Gln Ala Gln Val Asp Leu 
215 220 225 

Glu Arg Glu Lys Lys Glu Leu Ala Asp Ser Phe Ala Arg Val Ser 
230 235 240 

Asp Gln Ala Gln Arg Lys Thr Gln Glu Gln Gln Asp Val Leu Glu 
245 250 255 

Asn Leu Lys His Glu Leu Ala Thr Ser Arg Gln Glu Leu Gln Val 
260 265 270 

Leu His Ser Asn Leu Glu Thr Ser Ala Gln Ser Glu Ala Lys Trp 
275 280 285 

Leu Thr Gln Ile Ala Glu Leu Glu Lys Glu Gln Gly Ser Leu Ala 
290 295 300 

Thr Val Ala Ala Gln Arg Glu Glu Glu Leu Ser Ala Leu Arg Asp 
305 310 315 

Gln Leu Glu Ser Thr Gln Ile Lys Leu Ala Gly Ala Gln Glu Ser 
320 325 330 

Met Cys Gln Gln Val Lys Asp Gln Arg Lys Thr Leu Leu Ala Gly 
335 340 345 

Ile Arg Lys Ala Ala Glu Arg Glu Ile Gln Glu Ala Leu Ser Gln 
350 355 360 

Leu Glu Glu Pro Thr Leu Ile Ser Cys Ala Gly Ser Thr Asp His 
365 370 375 

Leu Leu Ser Lys Val Ser Ser Val Ser Ser Cys Leu Glu Gln Leu 
380 385 390 

Glu Lys Asn Gly Ser Gln Tyr Leu Ala Cys Pro Glu Asp Ile Ser 
395 400 405 

Glu Leu Leu His Ser Ile Thr Leu Leu Ala His Leu Thr Gly Asp 
410 415 420 

Thr Val Ile Gln Gly Ser Ala Thr Ser Leu Arg Ala Pro Pro Glu 
425 430 435 

Pro Ala Asp Ser Leu Thr Glu Ala Cys Arg Gln Tyr Gly Arg Glu 
440 445 450 

Thr Leu Ala Tyr Leu Ser Ser Leu Glu Glu Glu Gly Thr Val Glu 
455 460 465 

Asn Ala Asp Val Thr Ala Leu Arg Asn Cys Leu Ser Arg Val Lys 
470 475 480 

Thr Leu Gly Glu Glu Leu Leu Pro Arg Gly Leu Asp Ile Lys Gln 
485 490 495 

Glu Glu Leu Gly Asp Leu Val Asp Lys Glu Met Ala Ala Thr Ser 
500 505 510 

Ala Ala Ile Glu Ala Ala Thr Thr Arg Ile Glu Glu Ile Leu Ser 
515 520 525 

Lys Ser Arg Ala Gly Asp Thr Gly Val Lys Leu Glu Val Asn Glu 
530 535 540 

Arg Ile Leu Gly Ser Cys Thr Ser Leu Met Gln Ala Ile Lys Val 
545 550 555 

Leu Val Val Ala Ser Lys Asp Leu Gln Lys Glu Ile Val Glu Ser 
560 565 570 

Gly Arg Gly Ser Ala Ser Pro Lys Glu Phe Tyr Ala Lys Asn Ser 
575 580 585 

Arg Trp Thr Glu Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp 
590 595 600 

Gly Ala Thr Ile Met Val Asp Ala Ala Asp Leu Val Val Gln Gly 
605 610 615 

Lys Gly Lys Phe Glu Glu Leu Met Val Cys Ser Arg Glu Ile Ala 
620 625 630 

Ala Ser Thr Ala Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn 
635 640 645 

Lys Gly Ser Leu Asn Leu Thr Gln Leu Gln Gln Ala Ser Arg Gly 
650 655 660 

Val Asn Gln Ala Thr Ala Ala Val Val Ala Ser Thr Ile Ser Gly 
665 670 675 

Lys Ser Gln Ile Glu Glu Thr Asp Ser Met Asp Phe Ser Ser Met 
680 685 690 

Thr Leu Thr Gln Ile Lys Arg Gln Glu Met Asp Ser Gln Val Arg 
695 700 705 

Val Leu Glu Leu Glu Asn Asp Leu Gln Lys Glu Arg Gln Lys Leu 
710 715 720 

Gly Glu Leu Arg Lys Lys His Tyr Glu Leu Glu Gly Val Ala Glu 
725 730 735 

Gly Trp Glu Glu Gly Thr Glu Ala Ser Pro Ser Thr Val Gln Glu 
740 745 750 

Ala Ile Pro Asp Lys Glu 
755 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3964 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: cDNA for Huntingtin-interacting protein - mHIP1a 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGCACGAGGC GGCGCGCGGC CTCCGTGTGC CTAGGCTTGA GGCGGGCGGT 50 

GACGCCTCAT TCGCGCGGAG CCGGGCCGGG ACACGGTCGG CGGCAGCATG 100 

AACAGCATCA AGAATGTGCC GGCGCGGGTG CTGAGCCGCA GGCCGGGCCA 150 

CAGCCTAGAG GCCGAGCGCG AGCAGTTCGA CAAGACGCAG GCCATCAGTA 200 

TCAGCAAAGC CATCAACAGC CAGGAGGCCC CAGTGAAGGA GAAGCATGCC 250 

CGGCGTATCA TCCTGGGCAC GCATCATGAG AAGGGAGCCT TCACCTTCTG 300 

GTCCTATGCC ATCGGCCTGC CGCTGTCCAG CAGCTCCATC CTCAGCTGGA 350 

AGTTCTGTCA CGTCCTTCAC AAGGTCCTCC GGGACGGACA CCCCAACGTC 400 

CTGCATGACT ATCAGCGGTA CCGGAGCAAC ATACGTGAGA TCGGTGACTT 450 

GTGGGGCCAC CTTCGTGACC AGTATGGACA CCTGGTGAAT ATCTATACCA 500 

AACTGTTGCT GACTAAGATC TCCTTCCACC TTAAGCACCC CCAGTTTCCT 550 

GCAGGCCTGG AGGTAACAGA TGAGGTGTTG GAGAAGGCGG CGGGAACTGA 600 

TGTCAACAAC ATTTTTCAGC TTACCGTGGA GATGTTTGAC TACATGGACT 650 

GTGAACTGAA GCTTTCTGAG TCAGTTTTCC GGCAGCTCAA CACGGCCATC 700 

GCAGTGTCCC AGATGTCTTC TGGCCAGTGT CGCCTAGCGC CGCTCATCCA 750 

GGTCATTCAG GACTGCAGCC ACCTGTACCA CTACACAGTG AAGCTCATGT 800 

TTAAGCTGCA CTCCTGTCTC CCGGCAGACA CCCTGCAAGG CCACAGGGAT 850 

CGGTTCCACG AGCAGTTCCA CAGCCTCAAA AACTTCTTCC GCCGGGCTTC 900 

AGACATGCTG TACTTCAAGA GGCTCATCCA GATCCCGCGG CTGCCTGAGG 950 

GACCCCCCAA TTTCCTGCGG GCTTCAGCCC TGGCTGAGCA CATCAAGCCG 1000 

GTGGTGGTGA TTCCCGAGGA GGCCCCAGAG GAAGAGGAGC CTGAGAACCT 1050 

AATTGAAATC AGCAGTGCGC CCCCTGCTGG GGAGCCAGTG GTGGTGGCTG 1100 

ACCTCTTTGA TCAGACCTTT GGACCCCCCA ATGGCTCCAT GAAGGATGAC 1150 

AGGGACCTCC AAATCGAGAA CTTGAAGAGA GAGGTGGAGA CCCTCCGTGC 1200 

TGAGCTGGAG AAGATTAAGA TGGAGGCACA GCGGTACATC TCCCAGCTGA 1250 

AGGGCCAGGT GAATGGCCTG GAGGCAGAGC TGGAGGAGCA GCGCAAGCAG 1300 

AAGCAGAAGG CCCTGGTGGA CAACGAGCAG CTGCGCCACG AGCTGGCCCA 1350 

GCTCAAGGCC CTGCAGCTGG AGGGCGCCCG CAACCAGGGC CTTCGAGAGG 1400 

AAGCAGAGAG GAAGGCCAGT GCCACGGAGG CACGCTACAG CAAGCTGAAG 1450 

GAGAAACACA GCGAACTCAT TAACACGCAC GCCGAGCTGC TCAGGAAGAA 1500 

CGCAGACACG GCCAAGCAGC TGACAGTGAC ACAGCAGAGC CAGGAGGAGG 1550 

TGGCACGGGT AAAGGAACAG CTGGCCTTCC AGATGGAGCA AGCGAAGCGT 1600 

GAGTCTGAGA TGAAGATGGA AGAGCAGAGC GACCAGTTGG AGAAGCTCAA 1650 

GAGGGAGCTG GCGGCCAGGG CAGGAGAGCT GGCCCGTGCG CAGGAGGCCC 1700 

TGAGCCGCAC AGAACAGAGT GGGTCAGAGC TGAGCTCACG GCTGGACACA 1750 

CTGAACGCGG AGAAGGAAGC CCTGAGTGGA GTCGTTCGGC AGCGTGAGGC 1800 

AGAGCTGCTG GCCGCTCAGA GCCTGGTGCG GGAGAAGGAG GAGGCGCTTA 1850 

GCCAAGAGCA GCAGCGGAGC TCCCAGGAGA AGGGCGAGCT ACGGGGGCAG 1900 

CTGGCAGAAA AGGAGTCTCA GGAGCAGGGG CTTCGGCAGA AGCTGCTGGA 1950 

TGAGCAGTTG GCGGTGTTGC GAAGTGCAGC CGCCGAGGCA GAGGCCATCC 2000 

TACAGGATGC AGTGAGCAAG CTGGACGACC CCCTGCACCT CCGCTGCACC 2050 

AGCTCCCCAG ACTACTTGGT GAGCCGGGCT CAGGCAGCCC TGGACAGCGT 2100 

GAGCGGCCTG GAGCAGGGCC ACACCCAGTA CCTGGCTTCC TCCGAAGATG 2150 

CTTCTGCCCT GGTGGCAGCG CTGACCCGCT TCTCCCATTT GGCTGCGGAC 2200 

ACCATTGTCA ATGGTGCCGC CACCTCCCAC CTGGCCCCCA CCGACCCCGC 2250 

CGACCGCCTG ATGGACACAT GCAGGGAGTG TGGAGCCCGG GCTCTGGAGC 2300 

TGGTGGGACA GCTGCAAGAC CAGACAGTGC TACGGAGGGC TCAGCCCAGC 2350 

CTGATGCGGG CCCCCCTGCA GGGCATTCTG CAGTTGGGCC AGGACTTGAA 2400 

GCCTAAGAGC CTGGATGTAC GGCAAGAGGA GCTAGGGGCC ATGGTGGACA 2450 

AGGAGATGGC GGCCACCTCG GCAGCCATTG AGGACGCTGT GCGGAGGATC 2500 

GAGGACATGA TGAGCCAGGC CCGCCACGAG AGCTCAGGCG TGAAACTGGA 2550 

GGTGAATGAG AGGATCCTCA ACTCCTGCAC AGACCTGATG AAGGCTATCC 2600 

GGCTCCTGGT GATGACCTCC ACCAGCCTGC AGAAGGAAAT TGTGGAGAGC 2650 

GGCAGGGGGG CAGCAACGCA GCAGGAATTT TATGCCAAGA ATTCACGGTG 2700 

GACTGAAGGC CTCATCTCAG CCTCTAAGGC AGTGGGCTGG GGAGCCACAC 2750 

AGCTGGTGGA GTCAGCTGAC AAGGTTGTGC TTCACATGGG CAAATACGAG 2800 

GAACTCATCG TCTGCTCCCA TGAGATTGCG GCCAGCACGG CCCAGCTGGT 2850 

GGCAGCCTCG AAGGTGAAAG CCAACAAGAA CAGTCCCCAC TTGAGCCGCC 2900 

TGCAGGAATG TTCCCGCACT GTCAACGAGA GGGCTGCCAA CGTCGTGGCC 2950 

TCCACCAAAT CTGGCCAGGA GCAGATTGAG GACAGAGACA CCATGGATTT 3000 

CTCTGGCCTG TCCCTCATCA AGTTGAAGAA GCAGGAGATG GAGACACAGG 3050 

TGCGAGTCTT GGAGCTGGAG AAGACACTAG AGGCAGAGCG TGTCCGGCTC 3100 

GGGGAGCTTC GGAAACAGCA CTATGTACTG GCTGGGGGGA TGGGAACACC 3150 

TAGCGAAGAA GAACCCAGCA GACCCAGCCC AGCTCCCCGA AGTGGGGCCA 3200 

CTAAGAAGCC ACCGCTGGCC CAGAAACCCA GCATAGCCCC CAGGACAGAC 3250 

AACCAGCTCGA CAAAAAGGAT GGTGTCTACC CAGCTCAACT TGTGAACTAC 3300 

TAGGCCCCTAA GGTGTTCAGC AGGATGGCTG GTGGTTGTGC CTGGGCTTCA 3350 

TGTGGCTGTCT GGCAGTGGTC AAGGGGCCTC TGAGAAGCCT CCAACTCCTG 3400 

CCCAAGGGGCC TAGTCTGTGG GACAGTTCAT CTGGATGTGA ATCTATTTAT 3450 

CTTAAGTAGGA ACTGCCTCGA GCAGCTGGGA CCCAGCAGGC CTGAGCCACA 3500 

AATCTGCAGCG GACATCAGAG ATAGTCTGAA TGCTGCGAGG TATTTCTTTC 3550 

TTCGTAAGTTT AGTCAGCACA CTGGGAAAAG GTCACATAAG CCAGGAGCCT 3600 

CCTTGTCTCTG GACTCAAAAG TCTGAGGCCT TAAGTGAACA ACAGAAAGAG 3650 

GGTCCCTGCTG GCTACCAGGG ATAAGGGGAT GACCTGTGAC CCTTGAGCCA 3700 

GGGAGAGCAGG TAAGCTGGGT GGTGTCATCA CCTGGGGGCC TGGTGCTAGG 3750 

GCATCCATGCT GGGAGCCCCA GGAGACCAGG CTTTGTGTGG GAGCCTGGCA 3800 

TCATCGTGGCT GGGGCAGCCC CTGCTCAGGT GCTGTCTCTG CCCGTGACCT 3850 

TGAAGCCACCC TCCCCCCGTA CAGTTTTCCA TTCTCCTGGC TACTAGTGTG 3900 

GCTGTTCATTG CCTACCTTGA TGAGTAGATT TCAGCCCTCC TAAAGCTGGG 3950 

GCCTTTCCTCG TGCC 3964 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 676 

(B) TYPE: protein 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: mouse 

(ix) FEATURE: Huntingtin-interacting protein - mHIP1a 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Asn Ser Ile Lys Asn Val Pro Ala Arg Val Leu Ser Arg Arg 
5 10 15 

Pro Gly His Ser Leu Glu Ala Glu Arg Glu Gln Phe Asp Lys Thr 
20 25 30 

Gln Ala Ile Ser Ile Ser Lys Ala Ile Asn Ser Gln Glu Ala Pro 
35 40 45 

Val Lys Glu Lys His Ala Arg Arg Ile Ile Leu Gly Thr His His 
50 55 60 

Glu Lys Gly Ala Phe Thr Phe Trp Ser Tyr Ala Ile Gly Leu Pro 
65 70 75 

Leu Ser Ser Ser Ser Ile Leu Ser Trp Lys Phe Cys His Val Leu 
80 85 90 

His Lys Val Leu Arg Asp Gly His Pro Asn Val Leu His Asp Tyr 
95 100 105 

Gln Arg Tyr Arg Ser Asn Ile Arg Glu Ile Gly Asp Leu Trp Gly 
110 115 120 

His Leu Arg Asp Gln Tyr Gly His Leu Val Asn Ile Tyr Thr Lys 
125 130 135 

Leu Leu Leu Thr Lys Ile Ser Phe His Leu Lys His Pro Gln Phe 
140 145 150 

Pro Ala Gly Leu Glu Val Thr Asp Glu Val Leu Glu Lys Ala Ala 
155 160 165 

Gly Thr Asp Val Asn Asn Ile Phe Gln Leu Thr Val Glu Met Phe 
170 175 180 

Asp Tyr Met Asp Cys Glu Leu Lys Leu Ser Glu Ser Val Phe Arg 
185 190 195 



Gln Leu Asn Thr Ala Ile Ala Val Ser Gln Met Ser Ser Gly Gln 
200 205 210 

Cys Arg Leu Ala Pro Leu Ile Gln Val Ile Gln Asp Cys Ser His 
215 220 225 

Leu Tyr His Tyr Thr Val Lys Leu Met Phe Lys Leu His Ser Cys 
230 235 240 

Leu Pro Ala Asp Thr Leu Gln Gly His Arg Asp Arg Phe His Glu 
245 250 255 

Gln Phe His Ser Leu Lys Asn Phe Phe Arg Arg Ala Ser Asp Met 
260 265 270 

Leu Tyr Phe Lys Arg Leu Ile Gln Ile Pro Arg Leu Pro Glu Gly 
275 280 285 

Pro Pro Asn Phe Leu Arg Ala Ser Ala Leu Ala Glu His Ile Lys 
290 295 300 

Pro Val Val Val Ile Pro Glu Glu Ala Pro Glu Glu Glu Glu Pro 
305 310 315 

Glu Asn Leu Ile Glu Ile Ser Ser Ala Pro Pro Ala Gly Glu Pro 
320 325 330 

Val Val Val Ala Asp Leu Phe Asp Gln Thr Phe Gly Pro Pro Asn 
335 340 345 

Gly Ser Met Lys Asp Asp Arg Asp Leu Gln Ile Glu Asn Leu Lys 
350 355 360 

Arg Glu Val Glu Thr Leu Arg Ala Glu Leu Glu Lys Ile Lys Met 
365 370 375 

Glu Ala Gln Arg Tyr Ile Ser Gln Leu Lys Gly Gln Val Asn Gly 
380 385 390 

Leu Glu Ala Glu Leu Glu Glu Gln Arg Lys Gln Lys Gln Lys Ala 
395 400 405 

Leu Val Asp Asn Glu Gln Leu Arg His Glu Leu Ala Gln Leu Lys 
410 415 420 

Ala Leu Gln Leu Glu Gly Ala Arg Asn Gln Gly Leu Arg Glu Glu 
425 430 435 

Ala Glu Arg Lys Ala Ser Ala Thr Glu Ala Arg Tyr Ser Lys Leu 
440 445 450 



Lys Glu Lys His Ser Glu Leu Ile Asn Thr His Ala Glu Leu Leu 
455 460 465 

Arg Lys Asn Ala Asp Thr Ala Lys Gln Leu Thr Val Thr Gln Gln 
470 475 480 

Ser Gln Glu Glu Val Ala Arg Val Lys Glu Gln Leu Ala Phe Gln 
485 490 495 

Met Glu Gln Ala Lys Arg Glu Ser Glu Met Lys Met Glu Glu Gln 
500 505 510 

Ser Asp Gln Leu Glu Lys Leu Lys Arg Glu Leu Ala Ala Arg Ala 
515 520 525 

Gly Glu Leu Ala Arg Ala Gln Glu Ala Leu Ser Arg Thr Glu Gln 
530 535 540 

Ser Gly Ser Glu Leu Ser Ser Arg Leu Asp Thr Leu Asn Ala Glu 
545 550 555 

Lys Glu Ala Leu Ser Gly Val Val Arg Gln Arg Glu Ala Glu Leu 
560 565 570 

Leu Ala Ala Gln Ser Leu Val Arg Glu Lys Glu Glu Ala Leu Ser 
575 580 585 

Gln Glu Gln Gln Arg Ser Ser Gln Glu Lys Gly Glu Leu Arg Gly 
590 595 600 

Gln Leu Ala Glu Lys Glu Ser Gln Glu Gln Gly Leu Arg Gln Lys 
605 610 615 

Leu Leu Asp Glu Gln Leu Ala Val Leu Arg Ser Ala Ala Ala Glu 
620 625 630 

Ala Glu Ala Ile Leu Gln Asp Ala Val Ser Lys Leu Asp Asp Pro 
635 640 645 

Leu His Leu Arg Cys Thr Ser Ser Pro Asp Tyr Leu Val Ser Arg 
650 655 660 

Ala Gln Ala Ala Leu Asp Ser Val Ser Gly Leu Glu Gln Gly His 
665 670 675 

Thr Gln Tyr Leu Ala Ser Ser Glu Asp Ala Ser Ala Leu Val Ala 
680 685 690 

Ala Leu Thr Arg Phe Ser His Leu Ala Ala Asp Thr Ile Val Asn 
695 700 705 

Gly Ala Ala Thr Ser His Leu Ala Pro Thr Asp Pro Ala Asp Arg 
710 715 720 

Leu Met Asp Thr Cys Arg Glu Cys Gly Ala Arg Ala Leu Glu Leu 
725 730 735 

Val Gly Gln Leu Gln Asp Gln Thr Val Leu Arg Arg Ala Gln Pro 
740 745 750 

Ser Leu Met Arg Ala Pro Leu Gln Gly Ile Leu Gln Leu Gly Gln 
755 760 765 

Asp Leu Lys Pro Lys Ser Leu Asp Val Arg Gln Glu Glu Leu Gly 
770 775 780 

Ala Met Val Asp Lys Glu Met Ala Ala Thr Ser Ala Ala Ile Glu 
785 790 795 

Asp Ala Val Arg Arg Ile Glu Asp Met Met Ser Gln Ala Arg His 
800 805 810 

Glu Ser Ser Gly Val Lys Leu Glu Val Asn Glu Arg Ile Leu Asn 
815 820 825 

Ser Cys Thr Asp Leu Met Lys Ala Ile Arg Leu Leu Val Met Thr 
830 835 840 

Ser Thr Ser Leu Gln Lys Glu Ile Val Glu Ser Gly Arg Gly Ala 
845 850 855 

Ala Thr Gln Gln Glu Phe Tyr Ala Lys Asn Ser Arg Trp Thr Glu 
860 865 870 

Gly Leu Ile Ser Ala Ser Lys Ala Val Gly Trp Gly Ala Thr Gln 
875 880 885 

Leu Val Glu Ser Ala Asp Lys Val Val Leu His Met Gly Lys Tyr 
890 895 900 

Glu Glu Leu Ile Val Cys Ser His Glu Ile Ala Ala Ser Thr Ala 
905 910 915 

Gln Leu Val Ala Ala Ser Lys Val Lys Ala Asn Lys Asn Ser Pro 
920 925 930 

His Leu Ser Arg Leu Gln Glu Cys Ser Arg Thr Val Asn Glu Arg 
935 940 945 

Ala Ala Asn Val Val Ala Ser Thr Lys Ser Gly Gln Glu Gln Ile 
950 955 960 

Glu Asp Arg Asp Thr Met Asp Phe Ser Gly Leu Ser Leu Ile Lys 
965 970 975 

Leu Lys Lys Gln Glu Met Glu Thr Gln Val Arg Val Leu Glu Leu 
980 985 990 

Glu Lys Thr Leu Glu Ala Glu Arg Val Arg Leu Gly Glu Leu Arg 
995 1000 1005 

Lys Gln His Tyr Val Leu Ala Gly Gly Met Gly Thr Pro Ser Glu 
1010 1015 1020 

Glu Glu Pro Ser Arg Pro Ser Pro Ala Pro Arg Ser Gly Ala Thr 
1025 1030 1035 

Lys Lys Pro Pro Leu Ala Gln Lys Pro Ser Ile Ala Pro Arg Thr 
1040 1045 1050 

Asp Asn Gln Leu Asp Lys Lys Asp Gly Val Tyr Pro Ala Gln Leu 
1055 1060 1065 

Val Asn Tyr 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GAAGATACCC CACCAAAC 18 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GCTTGACAGT GTAGTCATAA AGGTGGCTGC AGTCC 35 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGACATGTCC AGGGAGTTGA ATAC 24 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: yes 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(xi)SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CUACUACUAC UACUAGGCCA CGCGTCGACT AGTACGGGII GGGIIGGGII G 41 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 516 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 1 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TCTGTGGAAG GTTTGGAGGG GAGAGAGGGG CAGCTGGATG CTCTTGGGCC ACGGTCGCCC 60 

CTGATCTCTG CGCCTCTTCC TCCTGCTCCG GGAGAAATAA TGTTTCCCTG GGGGATGAAA 120 

GCATCTCTTT GTGCGGGCTT TAATTGCCAT GTTGTTGTGC CAAGGGAGTG AGTGGCGGCG 180 

GGACCAGCAG CTGGGCACAG CCAATGCCAG GCAGTGGTGC CCACTCCCTC AGGACGCCCA 240 

GCCAGCTGGC TCCTGGGAGC GCTGCCCACC TCTGCCCCCA GCTGGGCGCC TGCAAGGAAC 300 

CGACCACCCG TGGGGCTGGG GGAGGTTGGC TGGAGGAGGA GAAAGGGGCG GGCTCTGGGA 360 

GGGTCTCAGC CACTCTCAGA GGCTTATTCA TCTCATCCTC CTTTCCCTCC CCCTTCTTGT 420 

TTTTCAGACT GTCAGCATCA ATAAGGCCAT TAATACGCAG GAAGTGGCTG TAAAGGAAAA 480 

ACACGCCAGA AATATCCTTT GGATGTTGCT TGGAAG 516 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 2 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

TGTTTTCCAT AACCCCCCCT CACCGTGCAT ACTGGGCACC CACCATGAGA AAGGGGCACA 60 
GACCTTCTGG TCTGTTGTCA ACCGCCTGCC TCTGTCTAGC AACCCAGTGC TCTGCTGGAA 120 
GTTCTGCCAT GTGTTCCACA AACTCCTCCG AGATGGACAC CCGAACGTGA GTTCCTGGGG 180 
CTATGGGGTG GCA 193 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 3 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GTGTTCTTTT GCCCCTGCAG GTCCTGAAGG ACTCTCTGAG ATACAGAAAT GAATTGAGTG 60 
ACATGAGCAG GATGTGGGTG AGTTTGGAGA TGTACTCAGG AGCC 104 

(2) INFORMATION FOR SEQ ID NO:20: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 4 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AATTCCTGGC TGCAGATCTC TTGACTGTTA TGTTCTTGTT GTTGACTCTG TTTCCCCTCC 60 
TCTTCCTAAA AGGGCCACCT GAGCGAGGGG TATGGCCAGC TGTGCAGCAT CTACCTGAAA 120 
CTGCTAAGAA CCAAGATGGA GTACCACACC AAAGTGAGTC TCTGCGGACA GTTCTGCCGC 180 
CACCGCCGCC TCCCCTGCTC CATCCCTTCA GCCCCTCCCT GGGCTCATTT GTCAGCTCTT 240 
TCAGGTAATA GACAGCCCAG GCTTCTGAGG AAGTGTGCAC ATCATGTACC CAAGCTGTGA 300 
GAGAGGAAAG CCACCGCCAG GCCCACG 327 



(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 5 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GGGCTCAAGC AATCCTCCCA CCTCGGCCTC CCAAGTAGCT GGGACCACAG GCGTGTGCCA 60 
CCACGCCCGG CTGAGAGAGG GCTCTTCATG TCTTCTGCCC TGACTCCCTT CCTCTGCCTC 120 
CCTTCCAGAA TCCCAGGTTC CCAGGCAACC TGCAGATGAG TGACCGCCAG CTGGACGAGG 180 
CTGGAGAAAG TGACGTGAAC AACTTGTAAG TGGCTCCTGC CCTGAGCCCA GGGAGGGAGA 240 
AAGCTTTTGT GAATGCTGAC ACTTCTCATA AGGGTCATGG AGGGCCTGAT GGGGGGAGGC 300 
CGTGGCTGGG ATGGGGACCA AAGCCCCTGG G 331 



(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 6 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ACTGTCGCTG TCACTGTTGA CTTCACCAGG CTGCATGGCC ATAATACCCA CAAGGCTAAG 60 

ACTTGGAGCT GGAGTTGTGT GTGTGTTTGC GCATGCACAT GAGCATTGGA GACTGGAGTA 120 

GCGTAGAGCG TGGGGGAGGG GACAGGTAAC AGACCGGCCT CAGGCTGTGG AGTGT7VAGCT 180 

CTCTTTCCTC TTGGGTCCAG TTTCCAGTTA ACAGTGGAGA TGTTTGACTA CCTGGAGTGT 240 

GAACTCAACC TCTTCCAAAC AGGTGAGTCT CTTCCCTCCC GTCTAACCCA GGCTCTCATG 300 

GGAACTACCT AATTCCTAGT CCTCCTCTCC CTGCAAAGTG TGCAGCACAA GGGGTAGGAA 360 

AATGGAGACA TTCACACCCC ATCTCTGGTC TCTCCAACCC TCGTGCAGGG AGGGACTGAA 420 

CCTCTTCAGT ATTTTTCTTT TTAAGAGACA AGGTCTCGGC CGGGTGCAGT 470 

(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 565 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 7 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

TCTTCACCTG TTTAATGGGG ATACGTTTAC CTATCTCATG GGAGTGTTGT GAAGGTTAAA 60 

TGAATTAGAT GAGGTAAAGC ACGCACAGAA TCGGTCCTTG GTGTATGTTG GACCCCTGCC 120 

TCTGCCCCTC TGAAGAGGCT GCCTGTAATC CCCTGGCTCT ACCACCTTTC TCCCTCACTT 180 

TTATTTCCTA GTATTCAACT CCCTGGACAT GTCCCGCTCT GTGTCCGTGA CGGCAGCAGG 240 

GCAGTGCCGC CTCGCCCCGC TGATCCAGGT CATCTTGGAC TGCAGCCACC TTTATGACTA 300 

CACTGTCAAG CTTCTCTTCA AACTCCACTC CTGTGAGTAC CGCGGGCCAG ATCTTCTTAC 360 

ATGAGATTCA GGCCAGAGGG AGGATCCCAG CCTGAGGATG TCCCCAGAGA AACGCAGTCC 420 

TTCTCAGTGC CTTTGGCTGT CTGCTTCTGT TCCAAAAGGC CCCGGAGCTT CTGACCATTG 480 

TGAGGATAAA AGAGCAGGGC CCAGGCTTTG GTGACCCCAG TAAAGCCCCT GGCTTGCCAC 540 

TCTTGCGTCC AGTGTTACAG GATCT 565 



(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 8 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

GGGACAGCTC TAGGCCAGTC GTGGCCCCTG GCAGTGCTGG CCACATGCCC CAGGGTAGCT 60 

GGGCCCCTCC CCCTCGAGAG CCCCGCTGTG GCTTCCCTGC CCTCTGGTCC CCCTCCCCTC 120 

TCACACTCTT TCCAATTTCT TCCAGGCCTC CCAGCTGACA CCCTGCAAGG CCACCGGGAC 180 

CGCTTCATGG AGCAGTTTAC AAAGTAAGTG GTTCAAGTAA CAGGAATGGA GGT 233 

(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 578 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exons 9 and 10 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

TGAATCCCAG CACCATGGAG TTTATCTCCT TGACAGCCTG TGCCTTTGGG CTGGGGAGGG 60 

GGCAGGAAAG CCAGGTGGCT GCTCTGTCCC CTACATGGGG CTGATGAAGA CACCCAGCAC 120 

CCCTCAGGTC CTTCTCCACC CCTAGGTTGA AAGATCTGTT CTACCGCTCC AGCAACCTGC 180 

AGTACTTCAA GCGGCTCATT CAGATCCCCC AGCTGCCTGA GGTAAGCATG CCCAACCACA 240 

CACCCTCGGC ACTGCAGAGG CCCCAGGTAC TCTCTTAAGG GCCGGCGGGG CCTGGCAAGC 300 

AAGCACTATT TGAGGATGTG TCTCCGTCTT CAGAACCCAC CCAACTTCCT GCGAGCCTCA 360 

GCCCTGTCAG AACATATCAG CCCTGTGGTG GTGATCCCTG CAGAGGCCTC ATCCCCCGAC 420 

AGCGAGCCAG TCCTAGAGAA GGATGACCTC ATGGACATGG ATGCCTCTCA GCAGGTGAGG 480 

ACCACTTGGG AGAGAAACTT GGCCTTTCCT CTCACCTGCA AGTACAGGGG AGAGGCTGGG 540 

GGAGACCCTG GCCAAAGCCC ATTGACTCTA ACCAGGTT 578 

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 390 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 11 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

AAAAAAATTT AAAAAATTAA ACAGGTCTGA ACCGTTTAAT TCGAGAAAGG GGGCATTCTC 60 

CCATATCACT CAACTGACCC ACACACAGAA TTCTCTGGCT CTCTGACTTA TTCTCACTCC 120 

TTTTTGGTCA ACCACAGAAT TTATTTGACA ACAAGTTTGA TGACATCTTT GGCAGTTCAT 180 

TCAGCAGTGA TCCCTTCAAT TTCAACAGTC AAAATGGTGT GAACAAGGAT GAGAAGTGAG 240 

TCCAAGCTGG GTTCAAGCAG ATGGTTCAGG AGCTAAGTTA AGCCATGGTC TGCCTCAAAA 300 

CACTAACCAA AGAGGAATTC TTAATGATAC TGGGGCTTCT TAGATACAGA ACATCTTGAA 360 

GGGTTGGGGG CAATGGCTTA TGCCTGTAAT 390 

(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 547 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 12 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

AAAATCAATA ACCATGGATT TATGAGTATT AGATTAGTAT CTGGTAACAT TTAGAGTATA 60 
ATTTATGGCA TTTCAAAGAA TTGTCCCCAA ATTAATACCA GCTTTTAATT TCCTCCCCTG 120 
AGCTCACAAT TAAAAACAGA GGGATAGAAG CACTATGAAA GCAAACTCAT TCCCCTTCTC 180 
TTCCCAGGGA CCACTTAATT GAGCGACTAT ACAGAGAGAT CAGTGGATTG AAGGCACAGC 240 
TAGAAAACAT GAAGACTGAG GTATAACTTG GATCTGCTCT GCCTTTGCGC TTCACCAAAA 300 
CACGGTAGAT TTGAATGTTA AATTTGCATC ACACTAGCCA GGCACAGTGG CTCACACCTG 360 
TAATCCTAGC ACTTTGGGAG GCCAAGGCAG GAGGATTACC TGAGGTCGGG AGTTCGAGAC 420 
CAGCCTGGGC AACAGGGTGA AACCCCCGTC TTCAATAAAA ATGCAATAAT TAGCCGGGTG 480 
TGTTGGCAGG CACCTGTAAT CCCAGCTACT CGGGAAGCTG AGGCATGAGA ATTGCTTGAA 540 
CTTGGGA 547 



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 436 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 13 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

CCCCCAGCCA CTCTAAAGAG GACCACAATT CCCCGGCCAT CATCCCCTGT TATTGTTGTT 60 
GATTGAGGGG CTCCTAATGA CCAGATGGTC CAACCCTCCT GGGACGTGGA GAGTTGACTT 120 
AGGGGAATCA GGTATTTACT TGGAAGCATG GTAGGACCCG CTTCTCCGGC CCATGCCCGT 180 
GACCCGTGGC AGTGGGCGGT TGGCCTCATG ACCGGAGTCC CCCCACAGAG CCAGCGGGTT 240 
GTGCTGCAGC TGAAGGGCCA CGTCAGCGAG CTGGAAGCAG ATCTGGCCGA GCAGCAGCAC 300 
CTGCGGCAGC AGGCGGCCGA CGACTGTGAA TTCCTGCGGG CAGAACTGGA CGAGCTCAGG 360 
AGGCAGCGGG AGGACACCGA GAAGGCTCAG CGGAGCCTGT CTGAGATAGA AAGTGAGCGG 420 
TGGGTGGGGG CGGGGG 436 

(2) INFORMATION FOR SEQ ID NO:29: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 469 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 14 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

GACTTGAGCC CAAGGAGGTC AAGGCTGCAG TGAACAGTGA TTGTGCCACT GCACCCCAGC 60 
CTGGGTGACA GAGCAAGACT GTCTCAAAAC AAAACAAGGA GGACCTTCTA GGGACCCTGG 120 
CTCATTGCAA GGAAGGCAAG GGTCCCTGCT AGGTTAGACT CCTCACCTTG GTCCTTTACA 180 
ATACAGGGAA AGCTCAAGCC AATGAACAGC GATATAGCAA GCTAAAGGAG AAGTACAGCG 240 
AGCTGGTTCA GAACCACGCT GACCTGCTGC GGAAGGTAAG ACCCTCAGCC CCTGTCACCA 300 
TCCTGCAGGC CCTGCACCTC TAGGGAGAGA GCGGCTCAGG CCTGTGGCTT CCCCGGGGCC 360 
AGCAACCCCT ACATTGATCT CTAAGGCATT GCCGTCATCT CGGGAACCAC ACCTTTTCAG 420 
GCTTCCTTGC CTCTGTGTCT TGGGCTGTGT CCTGGGTGCC AATCCCATG 469 



(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 15 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GGGTAGGAAA GTGATTCCTG TGTCTGACTC TAGGGCACGC ACAGCCTGAG TATGATTGTC 60 
CTAGAAGGAG GATGTCCTCT AAGCCTGGGA TCTCCTGGTT CAAGACACTG TTCTTCTTTT 120 
GCAGAATGCA GAGGTGACCA AACAGGTGTC CATGGCCAGA CAAGCCCAGG TAGATTTGGA 180 
ACGAGAGAAA AAAGAGCTGG AGGATTCGTT GGAGCGCATC AGTGACCAGG GCCAGCGGAA 240 
GGTGAGTGGG ACGAGGAGCA CTCGGGAAAT GAGGGAGGGG GCTGTTGAGT TGGTGGCGGG 300 
GGCTTTGTGG CCTTCTGCTC CATGGGCAGT TCTGTGGGTC GGTTGGCATC ACACAGCAG 359 

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 16 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

GTTGATCGCT TGGGACGTTT TTACATTTTT ATATTCTTTG TCACTGTCAC CCAGATCAGA 60 
GTCCCTCTGT TTTTCTTCTC TTTCAGACTC AAGAACAGCT GGAAGTTCTA GAGAGCTTGA 120 
AGCAGGAACT TGCCACAAGC CAACGGGAGC TTCAGGTTCT GCAAGGCAGC CTGGAAACTT 180 
CTGCCCAGGT AAATACCTCC TTTTTTTTT 209 



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 17 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 



CCCCCACTGC AATCAGTGTG TCCCCGGGAG GGAATCAGAG TGGCAGGTTA AAGAGCCATC 60 

ACCTTCCCAG TCCTTGCAAC CCGGTGGTGG GTTGGACCTC TGGGAAGTAG GGACTGTTTA 120 

ACTCAACCAG CGTCTCCCTC TTTCCTTGTG GTCACCTTTG CAGTCAGAAG CAAACTGGGC 180 

AGCCGAGTTC GCCGAGCTAG AGAAGGAGCG GGACAGCCTG GTGAGTGGCG CAGCTCATAG 240 

GGAGGAGGAA TTATCTGCTC TTCGGAAAGA ACTGCAGGAC ACTCAGCTCA AACTGGCCAG 300 

CACAGAGGGT CACGGACATG GACACGAGCG AGCACCTGTG AATTCCCACC GAGGGCCTCT 360 

GCGCATGCAC GGAGGCTGGG AGGACCCCGG GGCTGCTGAG AAGGGGTTTG GGGCCTTGGC 420 

CTGATTGTGC AGACATTCTG TAGGTGTAAT GCCAGCAGGC CCTGCATTGC CTGCAGAGTC 480 

CATGA 485 



(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 18 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

TTACTGGCTT GGACCTCATT GGCCATGACT TGAGCTAAGA TGCTAAGAGC CCCAGCCAGG 60 
TCATCCTGCT CAGGTTCATT ATGGAGTCTA GGGCAGACTC TCACCTCCCT GGACCATTTT 120 
TAGAATCTAT GTGCCAGCTT GCCAAAGACC AACGAAAAAT GCTTCTGGTG GGGTCCAGGA 180 

AGGCTGCGGA GCAGGTGATA CAAGACGCCC TGAACCAGCT TGAAGAACCT CCTCTCATCA 240 

GCTGCGCTGG GTCTGCAGGT ACACTTGCAA TTGCCCAGCT GGCAGGGGCC AGGTCCTTAC 300 

AGCCTGAGAC TCTGTTGATG TTGAATCTCA TGTGAGACTT AGCTCAGGGG CTCTCAGCCC 360 

AGCAGCATGT CAGCATTACC TTAGGGGCGC CCAGGCCCCA TCCTAGATCA GTTACATGTG 420 

GAAACTCTGT GCATTAGTGC CTATACACTA GTATTTTAGT ATTTTCTT 468 



(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 393 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 19 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



CACTAGTAAG CTCCTCCATT CAGTGCTTAA TTAACGAGGA TGAAGCCAGC TATGAGAACT 60 

TGCTCTGACC TTGCCCTGTG TTCCCTCTCA CAGATCACCT CCTCTCCACG GTCACATCCA 120 

TTTCCAGCTG CATCGAGCAA CTGGAGAAAA GCTGGAGCCA GTATCTGGCC TGCCCAGAAG 180 

GTAAGAATGG CCAAGGACAG TCTCTGTCGG CTAGTGATGG CCAGACAGGG TTCAGAAGCA 240 

CCTGAATGCG GGGATAGTGA CAGGTCCCTC TGCATCAAGA AAGGCATGTA GGCAACTCAT 300 

ACAAGAAAGG CATGTAGGCA ACTCATAAAA CGGGAGGAGA GGGTATGAAA GTGTCACCAT 360 

CAACCAGACC TGAGAAACTT CTCTTTCCAA TCC 393 



(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 421 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 20 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



GGCCTGCCCA GAAGGTAAGA ATGGCCAAGG ACAGTCTCTG TCGGCTAGTG ATGGCCAGAC 60 

AGGGTTCAGA AGCACCTGAA TGCGGGGATA GTGACAGGTC CCTCTGCATC AAGAAAGGCA 120 

TGTAGGCAAC TCATACAAGA AAGGCATGTA GGCAACTCAT AAAACGGGAG GAGAGGGTAT 180 

GAAAGTGTCA CCATCAACCA GACCTGAGAA ACTTCTCTTT CCAATCCTGG CAGACATCAG 240 

TGGACTTCTC CATTCCATAA CCCTGCTGGC CCACTTGACC AGCGACGCCA TTGCTCATGG 300 

TGCCACCACC TGCCTCAGAG CCCCACCTGA GCCTGCCGAC TGTGAGTACT GGGGCATGAG 360 

GGGCTGTTCA TGGACCAGGG GAGCAGGGGG CCTTTAAAAG TCTCTGTTGG GCCGGGCGCA 420 

G 421 

(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 498 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 21 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



AGGCCGAGGC AGGAGAATCG CTTGAACTCA GGAGGCGGAG TTTGCAGTGA GCCGAGATGG 60 

CGCCACTGCA CTCCAGCCTG GGCAACAAGA GCGAGACTCC ATCTCAAAAA AAAAGTGTCT 120 

ATTGCCTTGT ATCTCCAGCA CTGACCGAGG CCTGTAAGCA GTATGGCAGG GAAACCCTCG 180 

CCTACCTGGC CTCCCTGGAG GAAGAGGGAA GCCTTGAGAA TGCCGACAGC ACAGCCATGA 240 

GGAACTGCCT GAGCAAGATC AAGGCCATCG GCGAGGTACT TGGAGTAGTA TCATTGAGGA 300 

GCATTGTTAT TCTTCTGGGT GTGCGTGCTG GTGAATGGCC AGGGAATCGG TGATGTTCTG 360 

AGCTAGTTCT TTCTGCACTT AGAACTTGAT TCTAGAAAGA GATTGTTAAA ATTGGAAAAT 420 

CTGGCCGGGT GCAGTGATTT ATGCGTGTAA TCCCAGCACT TTGGGAGGCC GAGTCAGGAG 480 

GATCACTTGA GGCTAGAC 498 



(2) INFORMATION FOR SEQ ID NO:37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 22 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



CCCTGTGGCT TGCAGAAGGT GTTTGCTGGG TGGCCTCCTG CCTTGCCATC TTGTT^GGGT 60 

TACAGATGGC AGAGGAGAAG AGACAGGAGG CCCCAAGGTC AGTTCAGCCT TTGTGATGTG 120 

TTCACAGGAG CTCCTGCCCA GGGGACTGGA CATCAAGCAG GAGGAGCTGG GGGACCTGGT 180 

GGACAAGGAG ATGGCGGCCA CTTCAGCTGC TATTGAAACT GCCACGGCCA GAATAGAGGT 240 

AGGAGGTTCC TGCAGGATCT CCTGAAACGA TGCCTTTGCA GCTGCCCTTC TGCAACACTG 300 

CTCATTAAAC ATGTCACAGT CGTTCATTAA GGCCATGGCA ACCCCCTAAG ACAGAAACCA 360 

GAATTTGCCA GGCACAGTGG CTCATGCCTG TAACCCCAGC ACCTTGGGAG GATCACTTGA 420 

GTCCAGG 427 



(2) INFORMATION FOR SEQ ID NO:38: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 367 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 23 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

CCCCCTG7VAT AGGTTAGAGT CTGGATTCTT TTCTGACTCT CTCAAGAATG TGGGCAGGGA 60 
CTTGGGGACT TCCAGATTCA GGTTTCCCAG CTACCACACG ATGTTGGACT GAAAGTATAG 120 
TAAGACATTA GTGGATCCTT AATATTCAAG GCACATTTAG AAACCATGCT TCTTTTTCAC 180 
AGGAGATGCT CAGCAAATCC CGAGCAGGAG ACACAGGAGT CAAATTGGAG GTGAATGAAA 240 
GGTCGGTCTG AGCGGCATGG TGGGACCTAG GGGAGCAGGA TCTGTCTTCC TGACATTGGT 300 
CTATACTTTG CATACTTATT AGGGAATTAG AGGAGAGCAG TAGCAGCCAC GGGGAAGGGC 360 
TGAGTTG 367 

(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 502 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 24 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

CCCCGCAGAA TGTTCCAGCA ACCTCAGCAC CCTTCTTACC TCCCTTTCCC ATTCCAAGCT 60 

TGCCTTTGGC TAGGAGTGGG GAAGAGAACC GTCGTGTTCA TTGATCTTGG ATCTTGATCT 120 

CAGTGTATCC TCGACTTGTT TGTTTGGCAG GATCCTTGGT TGCTGTACCA GCCTCATGCA 180 

AGCTATTCAG GTGCTCATCG TGGCCTCTAA GGACCTCCAG AGAGAGATTG TGGAGAGCGG 240 

CAGGGTGAGC GTGGGTGTGG GCCCTGGGCA GGAAGAGGAG GCATCGGTGA CAGACTCCCG 300 

CTCCAACGGA CTCTGTGATG CTGCCGTCTT ACTCTGTGTG TCCACCTGAG TACAGAGCAG 360 

CCACTCCTGT AGATATCAGC AGAGGCCCTG GGGAGAAGTC AGAGCTCCAG GACCTCCCCA 420 

GAGGGTGGCC AGGCATGTGT CCCAACTCCA GCTCCCTTCG CACAGGCAGA CATTGTTGGA 480 

ACTTGCTGTG GGAGCCCTTT TT 502 

(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 437 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 
(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 25 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

TTTTGGTCTC TGAATCTTCT TCTTTTTTGT AAAATGGGAA TACTAATGCT TATGTCTCAG 60 
AGTTACTATG AC3GATGATTT GGGATAATAT ATGTATAAAA GCACCTGCCA TATAGTACAT 120 
GCTCAATAAA AGGTGGCTAT TACTATTTTT TATTTCCCTA GGGTACAGCA TCCCCTAAAG 180 
AGTTTTATGC CAAGAACTCT CGATGGACAG AAGGACTTAT CTCAGCCTCC AAGGCTGTGG 240 
GCTGGGGAGC CACTGTCATG GTGTAAGTAT CTATTGGTAC CAAGGGTCCT CCCATGACCC 300 
CTCTTCCATT GATCCACTCC AAACAATAGC TAAGGAGGGA AAAAAAAATC TGTCCCTTAG 360 
AAATAAACTA TTGATCAGGA AGTCAATAGG ACCGAGTTTA CAAGGGAGCC TGGCTCTCCC 420 
AGGGGACACA GGGCAGG 437 



(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 351 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 26 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GGGAGCCTGG CTCTCCCAGG GGACACAGGG CAGGCAGCCT CCCCTCCCTG TTTAGCCAAG 60 
GGCGATGGGG TGGTCTGGAG GTGGGATTGT GGAGGAGTTG CAGCTCATTT GCCCGTAACC 120 
TAGTCCCTCT TCTCGTTTTC CATCAGGGAT GCAGCTGATC TGGTGGTACA AGGCAGAGGG 180 
AAATTTGAGG AGCTAATGGT GTGTTCTCAT GAAATTGCTG CTAGCACAGC CCAGCTTGTG 240 
GCTGCATCCA AGGTAGGACC TGGCTGGACC TCCTAGGACG CTGGAAGGCC TGGTTAGAGA 300 
GTACTAGGCT AGGTTAAAGA GTACTTGGCT GCGTTAGGCA GTACTTGGCT G 351 

(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 418 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 27 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

CTTTTTATAT GATAGATATG TCAGGAGCTG ACTATAGTCA GCAGATTTTG AGAAGCTGAT 60 

TGGTGATTGC CGTTTGGCCC ACATATGTTT GCTAAGAACC ATCAGAGCAA TTATCTGATT 120 

CAGTCCTTGT TGCTCTAGGT GTTGTATGAA CCTAAATCTG CTTTGTCCTG GTAGGTGAAA 180 
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GCTGATAAGG ACAGCCCCAA CCTAGCCCAG CTGCAGCAGG CCTCTCGGGG AGTGAACCAG 240 

GCCACTGCCG GCGTTGTGGC CTCAACCATT TCCGGCAAAT CACAGATCGA AGAGACAGGT 300 

AGCCTTTCCA AAGGGACCCT TTTCTTACCC ACCCTGTTGA GCTCTTCTCT GCATCCTTCC 360 

CTGTGATCCC AACCAAATCC CACAGGACTG TGTCTAAATT CTTTCATATT TTTCATCT 418 



(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 28 of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



TTTCCACAGA GCATTGGCAT TGGCTGCCTC TCAGGTGCCA GTCAGCCAGG GTAGAATTTG 60 

ATGAGACCTT CTTGTTTCCA TCCTTGCAGA CAACATGGAC TTCTCAAGCA TGACGCTGAC 120 

ACAGATCAAA CGCCAAGAGA TGGATTCTCA GGTTAGGGTG CTAGAGCTAG AAAATGAATT 180 

GCAGAAGGAG CGTCAAAAAC TGGGAGAGCT TCGGAAAAAG CACTACGAGC TTGCTGGTGT 240 

TGCTGAGGGC TGGGAAGAAG GTAAGCTGAC TCAAAGGAT 279 



(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3715 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: genomic DNA 

(iii) HYPOTHETICAL: no 

(iv) ANTI-SENSE: no 
(vi) ORIGINAL SOURCE: 
(A) ORGANISM: human 

(x) FEATURE: exon 29 and partial cds of HIP1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



AACATAAATT ATCATTGTCT TTTAGGAACA GAGGCATCTC CACCTACACT GCAAGAAGTG 60 

GTAACCGAAA AAGAATAGAG CCAAACCAAC ACCCCATATG TCAGTGTAAA TCCTTGTTAC 120 

CTATCTCGTG TGTGTTATTT CCCCAGCCAC AGGCCAAATC CTTGGAGTCC CAGGGGCAGC 180 

CACAGGACTG CCATTACCCA GTGCCGAGGA CATGCATGAC ACTTCCCAAA GACTCCCTCC 240 

ATAGCGACAC CCTTTCTGTT TGGACCCATG GTCATCTCTG TTCTTTTCCC GCCTCCCTAG 300 

TTAGCATCCA GGCTGGCCAG TGCTGCCCAT GAGCAAGCCT AGGTACGAAG AGGGGTGGTG 360 

GGGGGCAGGG CCACTCAACA GAGAGGACCA ACATCCAGTC CTGCTGACTA TTTGACCCCC 420 

ACAACAATGG GTATCCTTAA TAGAGGAGCT GCTTGTTGTT TGTTGACAGC TTGGAAAGGG 480 

AAGATCTTAT GCCTTTTCTT TTCTGTTTTC TTCTCAGTCT TTTCAGTTTC ATCATTTGCA 540 

CAAACTTGTG AGCATCAGAG GGCTGATGGA TTCCAAACCA GGACACTACC CTGAGATCTG 600 

CACAGTCAGA AGGACGGCAG GAGTGTCCTG GCTGTGAATG CCAAAGCCAT TCTCCCCCTC 660 

TTTGGGCAGT GCCATGGATT TCCACTGCTT CTTATGGTGG TTGGTTGGGT TTTTTGGTTT 720 

TGTTTTTTTT TTTTAAGTTT CACTCACATA GCCAACTCTC CCAAAGGGCA CACCCCTGGG 780 

GCTGAGTCTC CAGGGCCCCC CAACTGTGGT AGCTCCAGCG ATGGTGCTGC CCAGGCCTCT 840 

CGGTGCTCCA TCTCCGCCTC CACACTGACC AAGTGCTGGC CCACCCAGTC CATGCTCCAG 900 

GGTCAGGCGG AGCTGCTGAG TGACAGCTTT CCTCAAAAAG CAGAAGGAGA GTGAGTGCCT 960 

TTCCCTCCTA AAGCTGAATC CCGGCGGAAA GCCTCTGTCC GCCTTTACAA GGGAGAAGAC 1020 

AACAGAAAGA GGGACAAGAG GGTTCACACA GCCCAGTTCC CGTGACGAGG CTCAAAAACT 1080 

TGATCACATG CTTGAATGGA GCTGGTGAGA TCAACAACAC TACTTCCCTG CCGGAATGAA 1140 

CTGTCCGTGA ATGGTCTCTG TCAAGCGGGC CGTCTCCCTT GGCCCAGAGA CGGAGTGTGG 1200 

GAGTGATTCC CAACTCCTTT CTGCAGACGT CTGCCTTGGC ATCCTCTTGA ATAGGAAGAT 1260 

CGTTCCACTT TCTACGCAAT TGACAAACCC GGAAGATCAG ATGCAATTGC TCCCATCAGG 1320 

GAAGAACCCT ATACTTGGTT TGCTACCCTT AGTATTTATT ACTAACCTCC CTTAAGCAGC 1380 

AACAGCCTAC AAAGAGATGC TTGGAGCAAT CAGAACTTCA GGTGTGACTC TAGCAAAGCT 1440 

CATCTTTCTG CCCGGCTACA TCAGCCTTCA AGAATCAGAA GAAAGCCAAG GTGCTGGACT 1500 

GTTACTGACT TGGATCCCAA AGCAAGGAGA TCATTTGGAG CTCTTGGGTC AGAGAAAATG 1560 

AGAAAGGACA GAGCCAGCGG CTCCAACTCC TTTCAGCCAC ATGCCCCAGG CTCTCGCTGC 1620 

CCTGTGGACA GGATGAGGAC AGAGGGCACA TGAACAGCTT GCCAGGGATG GGCAGCCCAA 1680 

CAGCACTTTT CCTCTTCTAG ATGGACCCCA GCATTTAAGT GACCTTCTGA TCTTGGGAAA 1740 

ACAGCGTCTT CCTTCTTTAT CTATAGCAAC TCATTGGTGG TAGCCATCAA GCACTTCCCA 1800 

GGATCTGCTC CAACAGAATA TTGCTAGGTT TTGCTACATG ACGGGTTGTG AGACTTCTGT 1860 

TTGATCACTG TGAACCAACC CCCATCTCCC TAGCCCACCC CCCTCCCCAA CTCCCTCTCT 1920 

GTGCATTTTC TAAGTGGGAC ATTCAAAAAA CTCTCTCCCA GGACCTCGGA TGACCATACT 1980 

CAGACGTGTG ACCTCCATAC TGGGTTAAGG AAGTATCAGC ACTAGAAATT GGGCAGTCTT 2040 

AATGTTGAAT GCTGCTTTCT GCTTAGTATT TTTTTGATTC AAGGCTCAGA AGGAATGGTG 2100 

CGTGGCTTCC CTGTCCCAGT TGTGGCAACT AAACCAATCG GTGTGTTCTT GATGCGGGTC 2160 

AACATTTCCA AAAGTGGCTA GTCCTCACTT CTAGATCTCA GCCATTCTAA CTCATATGTT 2220 

CCCAATTACC AAGGGGTGGC CGGGCACAGT GGCTCACGCC TGTAATCCCA GCACTTTGAG 2280 

AGGCTGAGGT GGTAGGATCA CCTGAGGTCA GGAGTTCAAG ACCAGCCTGT CCAACATGGT 2340 

GAAACCCCCA TCTCTACTAA AAATACCAAA AATTAGCCGA GCGTAGTGAC GGGTGCCCGT 2400 

AATCCCAGCT ACTCAGGAGG CTGAGACAGG AGAATCACCT GAACCCCAGA GGCAGAGGTT 2460 

GCAGTGAGCT GAGATCACGC CATTGTACTC CAGCCTGGGC AACAAGAGCA AAACTCCGTC 2520 

TCAAAAAAAA AAAAAAATTA CAAATGGGGC AAACAGTCTA GTGTAATGGA TCAAATTAAG 2580 

ATTCTCTGCC CAGCCGGGCA CAGTGGCGCA TGCCTGTAAT CCCAGAACTT TGGGAGGCCA 2640 

AGACGGGATG ATTGCTTGAG CTCAGGAGTT TGAGACCAGG CTGGGCATCA TAGCAAGACC 2700 

TCATCTCTAC TAAAATTCAA AAACAAAATT AGCCGGGCAT GATGGTGCAT GCCTGTAGTC 2760 

TCAGCTAGTT GGGGAGCTAA GGTGGGAGAA TTGCTTGAGC TTGGGAAGTC GAGGCTGCAG 2820 

TCAGCCCTGA TTGTGCCAGT GCACTCCGGC CTGGGTGACA GAGTGAGACC CGTGCTCAAA 2880 

AAAAAAAAGA TTCTGTGTCA GAGCCCAGCC CAGGAGTTTG AGGCTGCAAT GAGCCATGAT 2940 

TTCCCACTGC ACTCCAGCCT GAGTGACAGA GCGAGACTCC ATCTCTTTAA AAACAAACAA 3000 

AAAATTATCT GAATGATCCT GTCTCTAAAA AGAAGCCACA GAAATGTTTA AAAACTTCAT 3060 

CGACTTAGCC TGAGTCATAA CGGTTAAGAA AGCACTTAAA CAGAAGGAGA GGCTAATTCA 3120 

GTGTCACATG AGGAAGTAGC TGTCAGATGT CACATAATTA CTTTCGTAAT AGCTCAGATT 3180 

AGAATGGCTA CCCCATTCTC TAGACAAAAT CAAATTGTCC TATTGTGACT CTTCTAAAAA 3240 

TGAAGATGAA GAGCTATTTA ATGACACACC TTGGATTAAA ACGGGAATCA CATCTTAAAG 3300 

CTAAAAATGA ACCTGCAAGC CTTCTAAATG AGTCACTGAG CATCACTAGT GACAAGTCTC 3360 

GGGTGAGCGT AAATGGGTCA TGACAAGATG GGACAGCAAC AAAATCATGG CTTAGGATCG 3420 

ACAAGAAGTT AAAAAACAGC TGCATCTGTT ACTTAAGTTT GTAAGACAGT GCCCTGAGAC 3480 

CTCTAGAGAA AAGATGTTTG TTTACATAAG AGAAAGAAGG CCAGACATGG TGTCTCACAC 3540 

GTTTAATCCC AGCACTTTGG GAGGCAGGGG CGGGTGGATC ACCTGAGGTC AGGAGTTCAA 3600 

GACTAGCCTG GCCAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA ATTAGCCGGG 3660 

CATGGTGGCA GGCGCCTATA ATCCCAGCTA CTGGGGAGGC TGAGGCAGGA GAATC 3715 